


ICSLP 1998: Sydney, Australia
- The 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November - 4th December 1998. ISCA 1998
- Graeme M. Clark:
Cochlear implants in the second and third millennia. - Stephanie Seneff:
The use of linguistic hierarchies in speech understanding.
Text-To-Speech Synthesis 1-6
- Paul C. Bagshaw:
Unsupervised training of phone duration and energy models for text-to-speech synthesis. - Jerome R. Bellegarda, Kim E. A. Silverman:
Improved duration modeling of English phonemes using a root sinusoidal transformation. - Chilin Shih, Wentao Gu, Jan P. H. van Santen:
Efficient adaptation of TTS duration model to new speakers. - Takayoshi Yoshimura, Keiichi Tokuda, Takashi Masuko, Takao Kobayashi, Tadashi Kitamura:
Duration modeling for HMM-based speech synthesis. - Cameron S. Fordyce, Mari Ostendorf:
Prosody prediction for speech synthesis using transformational rule-based learning. - Susan Fitt, Stephen Isard:
Representing the environments for phonological processes in an accent-independent lexicon for synthesis of English. - Daniel Faulkner, Charles Bryant:
Efficient lexical retrieval for English text-to-speech synthesis. - Robert E. Donovan, Ellen Eide:
The IBM trainable speech synthesis system. - Sarah Hawkins, Jill House, Mark A. Huckvale, John Local, Richard Ogden:
Prosynth: an integrated prosodic approach to device-independent, natural-sounding speech synthesis. - Jialu Zhang, Shiwei Dong, Ge Yu:
Total quality evaluation of speech synthesis systems. - Gerit P. Sonntag, Thomas Portele:
Comparative evaluation of synthetic prosody with the PURR method. - Richard Sproat, Andrew J. Hunt, Mari Ostendorf, Paul Taylor, Alan W. Black, Kevin A. Lenzo, Mike Edgington:
SABLE: a standard for TTS markup. - H. Timothy Bunnell, Steven R. Hoskins, Debra Yarrington:
Prosodic vs. segmental contributions to naturalness in a diphone synthesizer. - Alex Acero:
A mixed-excitation frequency domain model for time-scale pitch-scale modification of speech. - Masami Akamine, Takehiko Kagoshima:
Analytic generation of synthesis units by closed loop training for totally speaker driven text to speech system (TOS drive TTS). - Martti Vainio, Toomas Altosaar:
Modeling the microprosody of pitch and loudness for speech synthesis with neural networks. - David T. Chappell, John H. L. Hansen:
Spectral smoothing for concatenative speech synthesis. - Aimin Chen, Saeed Vaseghi, Charles Ho:
MIMIC : a voice-adaptive phonetic-tree speech synthesiser. - Je Hun Jeon, Sunhwa Cha, Minhwa Chung, Jun Park, Kyuwoong Hwang:
Automatic generation of Korean pronunciation variants by multistage applications of phonological rules. - Stephen Cox, Richard Brady, Peter Jackson:
Techniques for accurate automatic annotation of speech waveforms. - Andrew Cronk, Michael W. Macon:
Optimized stopping criteria for tree-based unit selection in concatenative synthesis. - Stéphanie de Tournemire:
Automatic transcription of intonation using an identified prosodic alphabet. - Ignasi Esquerra, Albert Febrer, Climent Nadeu:
Frequency analysis of phonetic units for concatenative synthesis in Catalan. - Alex Chengyu Fang, Jill House, Mark A. Huckvale:
Investigating the syntactic characteristics of English tone units. - Antonio Bonafonte, Ignasi Esquerra, Albert Febrer, José A. R. Fonollosa, Francesc Vallverdú:
The UPC text-to-speech system for Spanish and Catalan. - Attila Ferencz, István Nagy, Tunde-Csilla Kovács, Maria Ferencz, Teodora Ratiu:
The new version of the ROMVOX text-to-speech synthesis system based on a hybrid time domain-LPC synthesis technique. - Takehiko Kagoshima, Masahiro Morita, Shigenobu Seto, Masami Akamine:
An F0 contour control model for totally speaker driven text to speech system. - Keikichi Hirose, Hiromichi Kawanami:
On the relationship of speech rates with prosodic units in dialogue speech. - Esther Klabbers, Raymond N. J. Veldhuis:
On the reduction of concatenation artefacts in diphone synthesis. - Chih-Chung Kuo, Kun-Yuan Ma:
Error analysis and confidence measure of Chinese word segmentation. - Jungchul Lee, Donggyu Kang, Sanghoon Kim, Koengmo Sung:
Energy contour generation for a sentence using a neural network learning method. - Yong-Ju Lee, Sook-Hyang Lee, Jong-Jin Kim, Hyun-Ju Ko, Young-Il Kim, Sanghun Kim, Jung-Cheol Lee:
A computational algorithm for F0 contour generation in Korean developed with prosodically labeled databases using K-ToBI system. - Kevin A. Lenzo, Christopher Hogan, Jeffrey Allen:
Rapid-deployment text-to-speech in the DIPLOMAT system. - Robert H. Mannell:
Formant diphone parameter extraction utilising a labelled single-speaker database. - Osamu Mizuno, Shin'ya Nakajima:
A new synthetic speech/sound control language. - Ryo Mochizuki, Yasuhiko Arai, Takashi Honda:
A study on the natural-sounding Japanese phonetic word synthesis by using the VCV-balanced word database that consists of the words uttered forcibly in two types of pitch accent. - Vincent Pagel, Kevin A. Lenzo, Alan W. Black:
Letter to sound rules for accented lexicon compression. - Ze'ev Roth, Judith Rosenhouse:
A name announcement algorithm with memory size and computational power constraints. - Frédérique Sannier, Rabia Belrhali, Véronique Aubergé:
How a French TTS system can describe loanwords. - Tomaž Šef, Aleš Dobnikar, Matjaž Gams:
Improvements in Slovene text-to-speech synthesis. - Shigenobu Seto, Masahiro Morita, Takehiko Kagoshima, Masami Akamine:
Automatic rule generation for linguistic features analysis using inductive learning technique: linguistic features analysis in TOS drive TTS system. - Yoshinori Shiga, Hiroshi Matsuura, Tsuneo Nitta:
Segmental duration control based on an articulatory model. - Evelyne Tzoukermann:
Text analysis for the Bell Labs French text-to-speech system. - Jennifer J. Venditti, Jan P. H. van Santen:
Modeling vowel duration for Japanese text-to-speech synthesis. - Ren-Hua Wang, Qingfeng Liu, Yongsheng Teng, Deyu Xia:
Towards a Chinese text-to-speech system with higher naturalness. - Andrew P. Breen, Peter Jackson:
A phonologically motivated method of selecting non-uniform units. - Steve Pearson, Nick Kibre, Nancy Niedzielski:
A synthesis method based on concatenation of demisyllables and a residual excited vocal tract model. - Ann K. Syrdal, Alistair Conkie, Yannis Stylianou:
Exploration of acoustic correlates in speaker selection for concatenative synthesis. - Johan Wouters, Michael W. Macon:
A perceptual evaluation of distance measures for concatenative speech synthesis. - Mike Plumpe, Alex Acero, Hsiao-Wuen Hon, Xuedong Huang:
HMM-based smoothing for concatenative speech synthesis. - Martin Holzapfel, Nick Campbell:
A nonlinear unit selection strategy for concatenative speech synthesis based on syllable level features. - Robert Eklund, Anders Lindström:
How to handle "foreign" sounds in Swedish text-to-speech conversion: approaching the 'xenophone' problem. - Nick Campbell:
Multi-lingual concatenative speech synthesis. - Takashi Saito:
On the use of F0 features in automatic segmentation for speech synthesis. - Atsuhiro Sakurai, Takashi Natsume, Keikichi Hirose:
A linguistic and prosodic database for data-driven Japanese TTS synthesis. - Alexander Kain, Michael W. Macon:
Text-to-speech voice adaptation from sparse training data. - Gregor Möhler:
Describing intonation with a parametric model.
Spoken Language Models and Dialog 1-5
- Joakim Gustafson, Patrik Elmberg, Rolf Carlson, Arne Jönsson:
An educational dialogue system with a user controllable dialogue manager. - Klaus Failenschmid, J. H. Simon Thornton:
End-user driven dialogue system design: the REWARD experience. - Yi-Chung Lin, Tung-Hui Chiang, Huei-Ming Wang, Chung-Ming Peng, Chao-Huang Chang:
The design of a multi-domain Mandarin Chinese spoken dialogue system. - Kallirroi Georgila, Anastasios Tsopanoglou, Nikos Fakotakis, George Kokkinakis:
An integrated dialogue system for the automation of call centre services. - Kuansan Wang:
An event driven model for dialogue systems. - Cosmin Popovici, Paolo Baggia, Pietro Laface, Loreta Moisa:
Automatic classification of dialogue contexts for dialogue predictions. - Ganesh N. Ramaswamy, Jan Kleindienst:
Automatic identification of command boundaries in a conversational natural language user interface. - Massimo Poesio, Andrei Mikheev:
The predictive power of game structure in dialogue act recognition: experimental results using maximum entropy estimation. - Paul C. Constantinides, Scott Hansma, Chris Tchou, Alexander I. Rudnicky:
A schema based approach to dialog control. - Gregory Aist:
Expanding a time-sensitive conversational architecture for turn-taking to handle content-driven interruption. - Marc Swerts, Hanae Koiso, Atsushi Shimojima, Yasuhiro Katagiri:
On different functions of repetitive utterances. - Hiroaki Noguchi, Yasuharu Den:
Prosody-based detection of the context of backchannel responses. - Lena Strömbäck, Arne Jönsson:
Robust interpretation for spoken dialogue systems. - Yohei Okato, Keiji Kato, Mikio Yamamoto, Shuichi Itahashi:
System-user interaction and response strategy in spoken dialogue system. - Noriko Suzuki, Kazuo Ishii, Michio Okada:
Organizing self-motivated dialogue with autonomous creatures. - Gerhard Hanrieder, Paul Heisterkamp, Thomas Brey:
Fly with the EAGLES: evaluation of the "ACCeSS" spoken language dialogue system. - Maria Aretoulaki, Stefan Harbeck, Florian Gallwitz, Elmar Nöth, Heinrich Niemann, Jozef Ivanecký, Ivo Ipsic, Nikola Pavesic, Václav Matousek:
SQEL: a multilingual and multifunctional dialogue system. - Stefan Kaspar, Achim G. Hoffmann:
Semi-automated incremental prototyping of spoken dialog systems. - Peter A. Heeman, Michael Johnston, Justin Denney, Edward C. Kaiser:
Beyond structured dialogues: factoring out grounding. - Masahiro Araki, Shuji Doshita:
A robust dialogue model for spoken dialogue processing. - Tom Brøndsted, Bo Nygaard Bai, Jesper Østergaard Olsen:
The REWARD service creation environment: an overview. - Matthew Bull, Matthew P. Aylett:
An analysis of the timing of turn-taking in a corpus of goal-oriented dialogue. - Sarah Davies, Massimo Poesio:
The provision of corrective feedback in a spoken dialogue CALL system. - Laurence Devillers, Hélène Bonneau-Maynard:
Evaluation of dialog strategies for a tourist information retrieval system. - Sadaoki Furui, Koh'ichiro Yamaguchi:
Designing a multimodal dialogue system for information retrieval. - Dinghua Guan, Min Chu, Quan Zhang, Jian Liu, Xiangdong Zhang:
The research project of man-computer dialogue system in Chinese. - Kate S. Hone, David Golightly:
Interfaces for speech recognition systems: the impact of vocabulary constraints and syntax on performance. - Tatsuya Iwase, Nigel Ward:
Pacing spoken directions to suit the listener. - Annika Flycht-Eriksson, Arne Jönsson:
A spoken dialogue system utilizing spatial information. - Candace A. Kamm, Diane J. Litman, Marilyn A. Walker:
From novice to expert: the effect of tutorials on user expertise with spoken dialogue systems. - Takeshi Kawabata:
Emergent computational dialogue management architecture for task-oriented spoken dialogue systems. - Tadahiko Kumamoto, Akira Ito:
An analysis of dialogues with our dialogue system through a WWW page. - Michael F. McTear:
Modelling spoken dialogues with state transition diagrams: experiences with the CSLU toolkit. - Michio Okada, Noriko Suzuki, Jacques M. B. Terken:
Situated dialogue coordination for spoken dialogue systems. - Xavier Pouteau, Luis Arévalo:
Robust spoken dialogue systems for consumer products: a concrete application. - Daniel Willett, Arno Romer, Jörg Rottland, Gerhard Rigoll:
A German dialogue system for scheduling dates and meetings by naturally spoken continuous speech. - Chung-Hsien Wu, Gwo-Lang Yan, Chien-Liang Lin:
Spoken dialogue system using corpus-based hidden Markov model. - Peter J. Wyard, Gavin E. Churcher:
A realistic wizard of oz simulation of a multimodal spoken language system. - Yen-Ju Yang, Lin-Shan Lee:
A syllable-based Chinese spoken dialogue system for telephone directory services primarily trained with a corpus. - Hiroyuki Yano, Akira Ito:
How disagreement expressions are used in cooperative tasks.
Prosody and Emotion 1-6
- Phil Rose:
Tones of a tridialectal: acoustic and perceptual data on ten linguistic tonetic contrasts between Lao, Nyo and Standard Thai. - Napier Guy Ian Thompson:
Tone sandhi between complex tones in a seven-tone southern Thai dialect. - Alexander Robertson Coupe:
The acoustic and perceptual features of tone in the Tibeto-Burman language Ao Naga. - Phil Rose:
The differential status of semivowels in the acoustic phonetic realisation of tone. - Kai Alter, Karsten Steinhauer, Angela D. Friederici:
De-accentuation: linguistic environments and prosodic realizations. - N. Amir, S. Ron:
Towards an automatic classification of emotions in speech. - Marc Schröder, Véronique Aubergé, Marie-Agnès Cathiard:
Can we hear smile? - Matthew P. Aylett, Matthew Bull:
The automatic marking of prominence in spontaneous speech using duration and part of speech information. - JongDeuk Kim, SeongJoon Baek, Myung-Jin Bae:
On a pitch alteration technique in excited cepstral spectrum for high quality TTS. - Jan Buckow, Anton Batliner, Richard Huber, Elmar Nöth, Volker Warnke, Heinrich Niemann:
Dovetailing of acoustics and prosody in spontaneous speech recognition. - Janet E. Cahn:
A computational memory and processing model for prosody. - Belinda Collins:
Convergence of fundamental frequencies in conversation: if it happens, does it matter? - Hiroya Fujisaki, Sumio Ohno, Takashi Yagi, Takeshi Ono:
Analysis and interpretation of fundamental frequency contours of British English in terms of a command-response model. - Frode Holm, Kazue Hata:
Common patterns in word level prosody. - Yasuo Horiuchi, Akira Ichikawa:
Prosodic structure in Japanese spontaneous speech. - Shunichi Ishihara:
An acoustic-phonetic description of word tone in Kagoshima Japanese. - Koji Iwano, Keikichi Hirose:
Representing prosodic words using statistical models of moraic transition of fundamental frequency contours of Japanese. - Tae-Yeoub Jang, Minsuck Song, Kiyeong Lee:
Disambiguation of Korean utterances using automatic intonation recognition. - Oliver Jokisch, Diane Hirschfeld, Matthias Eichner, Rüdiger Hoffmann:
Multi-level rhythm control for speech synthesis using hybrid data driven and rule-based approaches. - Jiangping Kong:
EGG model of ditoneme in Mandarin. - Geetha Krishnan, Wayne H. Ward:
Temporal organization of speech for normal and fast rates. - Haruo Kubozono:
A syllable-based generalization of Japanese accentuation. - Hyuck-Joon Lee:
Non-adjacent segmental effects in tonal realization of accentual phrase in Seoul Korean. - Eduardo López, Javier Caminero, Ismael Cortázar, Luis A. Hernández Gómez:
Improvement on connected numbers recognition using prosodic information. - Kazuaki Maeda, Jennifer J. Venditti:
Phonetic investigation of boundary pitch movements in Japanese. - Kikuo Maekawa:
Phonetic and phonological characteristics of paralinguistic information in spoken Japanese. - Arman Maghbouleh:
ToBI accent type recognition. - Hansjörg Mixdorff, Hiroya Fujisaki:
The influence of syllable structure on the timing of intonational events in German. - Osamu Mizuno, Shin'ya Nakajima:
New prosodic control rules for expressive synthetic speech. - Mitsuru Nakai, Hiroshi Shimodaira:
The use of F0 reliability function for prosodic command analysis on F0 contour generation model. - Sumio Ohno, Hiroya Fujisaki, Hideyuki Taguchi:
Analysis of effects of lexical accent, syntax, and global speech rate upon the local speech rate. - Sumio Ohno, Hiroya Fujisaki, Yoshikazu Hara:
On the effects of speech rate upon parameters of the command-response model for the fundamental frequency contours of speech. - Thomas Portele, Barbara Heuft:
The maximum-based description of F0 contours and its application to English. - Thomas Portele:
Perceived prominence and acoustic parameters in American English. - Erhard Rank, Hannes Pirker:
Generating emotional speech with a concatenative synthesizer. - Albert Rilliard, Véronique Aubergé:
A perceptive measure of pure prosody linguistic functions with reiterant sentences. - Kazuhito Koike, Hirotaka Suzuki, Hiroaki Saito:
Prosodic parameters in emotional speech. - Barbertje M. Streefkerk, Louis C. W. Pols, Louis ten Bosch:
Automatic detection of prominence (as defined by listeners' judgements) in read-aloud Dutch sentences. - Masafumi Tamoto, Takeshi Kawabata:
A schema for illocutionary act identification with prosodic feature. - Wataru Tsukahara:
An algorithm for choosing Japanese acknowledgments using prosodic cues and context. - Chao Wang, Stephanie Seneff:
A study of tones and tempo in continuous Mandarin digit strings and their application in telephone quality speech recognition. - Sandra P. Whiteside:
Simulated emotions: an acoustic study of voice and perturbation measures. - Jinsong Zhang, Keikichi Hirose:
A robust tone recognition method of Chinese based on sub-syllabic F0 contours. - Xiaonong Sean Zhu:
The microprosodics of tone sandhi in Shanghai disyllabic compounds. - Natalija Bolfan-Stosic, Tatjana Prizl:
Jitter and shimmer differences between pathological voices of school children. - Xiaonong Sean Zhu:
What spreads, and how? Tonal rightward spreading on Shanghai disyllabic compounds. - Sean Zhu, Phil Rose:
Tonal complexity as a dialectal feature: 25 different citation tones from four Zhejiang Wu dialects. - Juan Manuel Montero, Juana M. Gutiérrez-Arriola, Sira E. Palazuelos, Emilia Enríquez, Santiago Aguilera, José Manuel Pardo:
Emotional speech synthesis: from speech database to TTS. - Cécile Pereira, Catherine I. Watson:
Some acoustic characteristics of emotion. - Marc Swerts:
Intonative structure as a determinant of word order variation in Dutch verbal endgroups. - Johanneke Caspers:
Experiments on the meaning of two pitch accent types: the 'pointed hat' versus the accent-lending fall in Dutch. - Sun-Ah Jun, Hyuck-Joon Lee:
Phonetic and phonological markers of contrastive focus in Korean. - Emiel Krahmer, Marc Swerts:
Reconciling two competing views on contrastiveness. - Paul Taylor:
The tilt intonation model. - Hiroya Fujisaki, Sumio Ohno, Seiji Yamada:
Analysis of occurrence of pauses and their durations in Japanese text reading. - Estelle Campione, Jean Véronis:
A statistical study of pitch target points in five languages. - Fabrice Malfrère, Thierry Dutoit, Piet Mertens:
Fully automatic prosody generator for text-to-speech. - Halewijn Vereecken, Jean-Pierre Martens, Cynthia Grover, Justin Fackrell, Bert Van Coile:
Automatic prosodic labeling of 6 languages. - Helen Wright:
Automatic utterance type detection using suprasegmental features. - Ee Ling Low, Esther Grabe:
A contrastive study of lexical stress placement in Singapore English and British English. - Florian Gallwitz, Anton Batliner, Jan Buckow, Richard Huber, Heinrich Niemann, Elmar Nöth:
Integrated recognition of words and phrase boundaries. - Amalia Arvaniti:
Phrase accents revisited: comparative evidence from Standard and Cypriot Greek. - Grzegorz Dogil, Gregor Möhler:
Phonetic invariance and phonological stability: Lithuanian pitch accents. - Christel Brindöpke, Gernot A. Fink, Franz Kummert, Gerhard Sagerer:
A HMM-based recognition system for perceptive relevant pitch movements of spontaneous German speech. - Jean Véronis, Estelle Campione:
Towards a reversible symbolic coding of intonation.
Hidden Markov Model Techniques 1-3
- Xiaoqiang Luo, Frederick Jelinek:
Nonreciprocal data sharing in estimating HMM parameters. - Jeff A. Bilmes:
Data-driven extensions to HMM statistical dependencies. - Jiping Sun, Li Deng:
Use of high-level linguistic constraints for constructing feature-based phonological model in speech recognition. - Steven C. Lee, James R. Glass:
Real-time probabilistic segmentation for segment-based speech recognition. - Guillaume Gravier, Marc Sigelle, Gérard Chollet:
Toward Markov random field modeling of speech. - Rukmini Iyer, Herbert Gish, Man-Hung Siu, George Zavaliagkos, Spyros Matsoukas:
Hidden Markov models for trajectory modeling. - Katsura Aizawa, Chieko Furuichi:
A statistical phonemic segment model for speech recognition based on automatic phonemic segmentation. - Kris Demuynck, Jacques Duchateau, Dirk Van Compernolle, Patrick Wambacq:
Improved feature decorrelation for HMM-based speech recognition. - Johan A. du Preez, David M. Weber:
Efficient high-order hidden Markov modelling. - Ellen Eide, Lalit R. Bahl:
A time-synchronous, tree-based search strategy in the acoustic fast match of an asynchronous speech recognition system. - Jürgen Fritsch, Michael Finke, Alex Waibel:
Effective structural adaptation of LVCSR systems to unseen domains using hierarchical connectionist acoustic models. - Aravind Ganapathiraju, Jonathan Hamaker, Joseph Picone:
Support vector machines for speech recognition. - Malan B. Gandhi:
Natural number recognition using discriminatively trained inter-word context dependent hidden Markov models. - Jonathan Hamaker, Aravind Ganapathiraju, Joseph Picone:
Information theoretic approaches to model selection. - Kengo Hanai, Kazumasa Yamamoto, Nobuaki Minematsu, Seiichi Nakagawa:
Continuous speech recognition using segmental unit input HMMs with a mixture of probability density functions and context dependency. - Jacques Simonin, Lionel Delphin-Poulat, Géraldine Damnati:
Gaussian density tree structure in a multi-Gaussian HMM-based speech recognition system. - Hiroaki Kojima, Kazuyo Tanaka:
Generalized phone modeling based on piecewise linear segment lattice. - Ryosuke Koshiba, Mitsuyoshi Tachimori, Hiroshi Kanazawa:
A flexible method of creating HMM using block-diagonalization of covariance matrices. - Cristina Chesta, Pietro Laface, Franco Ravera:
HMM topology selection for accurate acoustic and duration modeling. - Tan Lee, Rolf Carlson, Björn Granström:
Context-dependent duration modelling for continuous speech recognition. - Brian Mak, Enrico Bocchieri:
Training of context-dependent subspace distribution clustering hidden Markov model. - Cesar Martín del Alamo, Luis Villarrubia, Francisco Javier González, Luis A. Hernández Gómez:
Unsupervised training of HMMs with variable number of mixture components per state. - Máté Szarvas, Shoichi Matsunaga:
Acoustic observation context modeling in segment based speech recognition. - Ji Ming, Philip Hanna, Darryl Stewart, Saeed Vaseghi, Francis Jack Smith:
Capturing discriminative information using multiple modeling techniques. - Laurence Molloy, Stephen Isard:
Suprasegmental duration modelling with elastic constraints in automatic speech recognition. - Albino Nogueiras Rodríguez, José B. Mariño, Enric Monte:
An adaptive gradient-search based algorithm for discriminative training of HMM's. - Albino Nogueiras Rodríguez, José B. Mariño:
Task adaptation of sub-lexical unit models using the minimum confusibility criterion on task independent databases. - Gordon Ramsay:
Stochastic calculus, non-linear filtering, and the internal model principle: implications for articulatory speech recognition. - Christian Wellekens, Jussi Kangasharju, Cedric Milesi:
The use of meta-HMM in multistream HMM training for automatic speech recognition. - Christian Wellekens:
Enhanced ASR by acoustic feature filtering. - Christoph Neukirchen, Daniel Willett, Gerhard Rigoll:
Soft state-tying for HMM-based speech recognition. - Silke M. Witt, Steve J. Young:
Estimation of models for non-native speech in computer-assisted language learning based on linear model combination. - Tae-Young Yang, Ji-Sung Kim, Chungyong Lee, Dae Hee Youn, Il-Whan Cha:
Duration modeling using cumulative duration probability and speaking rate compensation. - Geoffrey Zweig, Stuart Russell:
Probabilistic modeling with Bayesian networks for automatic speech recognition.
Speaker and Language Recognition 1-4
- Perasiriyan Sivakumaran, Aladdin M. Ariyaeeinia, Jill A. Hewitt:
Sub-band based speaker verification using dynamic recombination weights. - Michael Barlow, Michael Wagner:
Measuring the dynamic encoding of speaker identity and dialect in prosodic parameters. - Nicole Beringer, Florian Schiel, Peter Regel-Brietzmann:
German regional variants - a problem for automatic speech recognition? - Kay Berkling, Marc A. Zissman, Julie Vonwiller, Christopher Cleirigh:
Improving accent identification through knowledge of English syllable structure. - Zinny S. Bond, Donald Fucci, Verna Stockmal, Douglas McColl:
Multi-dimensional scaling of listener responses to complex auditory stimuli. - Verna Stockmal, Danny R. Moates, Zinny S. Bond:
Same talker, different language. - Susanne Burger, Daniela Oppermann:
The impact of regional variety upon specific word categories in spontaneous German. - Dominique Genoud, Gérard Chollet:
Speech pre-processing against intentional imposture in speaker recognition. - Mike Lincoln, Stephen Cox, Simon Ringland:
A comparison of two unsupervised approaches to accent identification. - Dominik R. Dersch, Christopher Cleirigh, Julie Vonwiller:
The influence of accents in Australian English vowels and their relation to articulatory tract parameters. - Johan A. du Preez, David M. Weber:
Automatic language recognition using high-order HMMs. - Marcos Faúndez-Zanuy, Daniel Rodriguez-Porcheron:
Speaker recognition using residual signal of linear and nonlinear prediction models. - Yong Gu, Trevor Thomas:
An implementation and evaluation of an on-line speaker verification system for field trials. - Javier Hernando, Climent Nadeu:
Speaker verification on the Polycost database using frequency filtered spectral energies. - Qin Jin, Luo Si, Qixiu Hu:
A high-performance text-independent speaker identification system based on BCDM. - Hiroshi Kido, Hideki Kasuya:
Representation of voice quality features associated with talker individuality. - Ji-Hwan Kim, Gil-Jin Jang, Seong-Jin Yun, Yung-Hwan Oh:
Candidate selection based on significance testing and its use in normalisation and scoring. - Yuko Kinoshita:
Japanese forensic phonetics: non-contemporaneous within-speaker variation in natural and read-out speech. - Filipp Korkmazskiy, Biing-Hwang Juang:
Statistical modeling of pronunciation and production variations for speech recognition. - Arne Kjell Foldvik, Knut Kvale:
Dialect maps and dialect research; useful tools for automatic speech recognition? - Youn-Jeong Kyung, Hwang-Soo Lee:
Text independent speaker recognition using micro-prosody. - Yoik Cheng, Hong C. Leung:
Speaker verification using fundamental frequency. - Weijie Liu, Toshihiro Isobe, Naoki Mukawa:
On optimum normalization method used for speaker verification. - Harvey Lloyd-Thomas, Eluned S. Parris, Jeremy H. Wright:
Recurrent substrings and data fusion for language recognition. - Konstantin P. Markov, Seiichi Nakagawa:
Text-independent speaker recognition using multiple information sources. - Konstantin P. Markov, Seiichi Nakagawa:
Discriminative training of GMM using a modified EM algorithm for speaker recognition. - Driss Matrouf, Martine Adda-Decker, Lori Lamel, Jean-Luc Gauvain:
Language identification incorporating lexical information. - Enric Monte, Ramon Arqué, Xavier Miró:
A VQ based speaker recognition system based on histogram distances: text independent and for noisy environments. - Asunción Moreno, José B. Mariño:
Spanish dialects: phonetic transcription. - Mieko Muramatsu:
Acoustic analysis of Japanese English prosody: comparison between Fukushima dialect speakers and Tokyo dialect speakers in declarative sentences and yes-no questions. - Hideki Noda, Katsuya Harada, Eiji Kawaguchi, Hidefumi Sawai:
A context-dependent approach for speaker verification using sequential decision. - Javier Ortega-Garcia, Santiago Cruz-Llanas, Joaquin Gonzalez-Rodriguez:
Quantitative influence of speech variability factors for automatic speaker verification in forensic tasks. - Thilo Pfau, Günther Ruske:
Creating hidden Markov models for fast speech. - Tuan D. Pham, Michael Wagner:
Speaker identification using relaxation labeling. - Leandro Rodríguez Liñares, Carmen García-Mateo:
A novel technique for the combination of utterance and speaker verification systems in a text-dependent speaker verification task. - Phil Rose:
A forensic phonetic investigation into non-contemporaneous variation in the f-pattern of similar-sounding speakers. - Astrid Schmidt-Nielsen, Thomas H. Crystal:
Human vs. machine speaker identification with telephone speech. - Stefan Slomka, Sridha Sridharan, Vinod Chandran:
A comparison of fusion techniques in mel-cepstral based speaker identification. - Hagen Soltau, Alex Waibel:
On the influence of hyperarticulated speech on recognition performance. - Nuala C. Ward, Dominik R. Dersch:
Text-independent speaker identification and verification using the TIMIT database. - Lisa Yanguas, Gerald C. O'Leary, Marc A. Zissman:
Incorporating linguistic knowledge into automatic dialect identification of Spanish. - Yiying Zhang, Xiaoyan Zhu:
A novel text-independent speaker verification method using the global speaker model. - Aaron E. Rosenberg, Ivan Magrin-Chagnolleau, Sarangarajan Parthasarathy, Qian Huang:
Speaker detection in broadcast speech databases. - Eluned S. Parris, Michael J. Carey:
Multilateral techniques for speaker recognition. - Masafumi Nishida, Yasuo Ariki:
Real time speaker indexing based on subspace method - application to TV news articles and debate. - George R. Doddington, Walter Liggett, Alvin F. Martin, Mark A. Przybocki, Douglas A. Reynolds:
SHEEP, GOATS, LAMBS and WOLVES: a statistical analysis of speaker performance in the NIST 1998 speaker recognition evaluation. - Andrés Corrada-Emmanuel, Michael Newman, Barbara Peskin, Larry Gillick, Robert Roth:
Progress in speaker recognition at Dragon Systems. - Tomas Nordström, Håkan Melin, Johan Lindberg:
A comparative study of speaker verification systems using the Polycost database. - Tomoko Matsui, Kiyoaki Aikawa:
Robust speaker verification insensitive to session-dependent utterance variation and handset-dependent distortion. - Håkan Melin, Johan Koolwaaij, Johan Lindberg, Frédéric Bimbot:
A comparative evaluation of variance flooring techniques in HMM-based speaker verification. - Dijana Petrovska-Delacrétaz, Jan Cernocký, Jean Hennebert, Gérard Chollet:
Text-independent speaker verification using automatically labelled acoustic segments. - Qi Li:
A fast decoding algorithm based on sequential detection of the changes in distribution. - Jesper Østergaard Olsen:
Speaker verification with ensemble classifiers based on linear speech transforms. - Jesper Østergaard Olsen:
Speaker recognition based on discriminative projection models. - James Moody, Stefan Slomka, Jason W. Pelecanos, Sridha Sridharan:
On the convergence of Gaussian mixture models: improvements through vector quantization. - M. Kemal Sönmez, Elizabeth Shriberg, Larry P. Heck, Mitchel Weintraub:
Modeling dynamic prosodic variation for speaker verification. - Douglas A. Reynolds, Elliot Singer, Beth A. Carlson, Gerald C. O'Leary, Jack McLaughlin, Marc A. Zissman:
Blind clustering of speech utterances based on speaker and language characteristics. - Diamantino Caseiro, Isabel Trancoso:
Spoken language identification using the speechdat corpus. - Jerome Braun, Haim Levkowitz:
Automatic language identification with perceptually guided training and recurrent neural networks. - Sarel van Vuuren, Hynek Hermansky:
On the importance of components of the modulation spectrum for speaker verification.
Multimodal Spoken Language Processing 1-3
- Andrew P. Breen, O. Gloaguen, P. Stern:
A fast method of producing talking head mouth shapes from real speech. - Philip R. Cohen, Michael Johnston, David McGee, Sharon L. Oviatt, Josh Clow, Ira A. Smith:
The efficiency of multimodal interaction: a case study. - László Czap:
Audio and audio-visual perception of consonants disturbed by white noise and 'cocktail party'. - Simon Downey, Andrew P. Breen, Maria Fernández, Edward Kaneen:
Overview of the MAYA spoken language system. - Mauro Cettolo, Daniele Falavigna:
Automatic recognition of spontaneous speech dialogues. - Georg Fries, Stefan Feldes, Alfred Corbet:
Using an animated talking character in a web-based city guide demonstrator. - Rika Kanzaki, Takashi Kato:
Influence of facial views on the McGurk effect in auditory noise. - Tom Brøndsted, Lars Bo Larsen, Michael Manthey, Paul McKevitt, Thomas B. Moeslund, Kristian G. Olesen:
The intellimedia workbench - a generic environment for multimodal systems. - Josh Clow, Sharon L. Oviatt:
STAMP: a suite of tools for analyzing multimodal system processing. - Sumi Shigeno:
Cultural similarities and differences in the recognition of audio-visual speech stimuli. - Toshiyuki Takezawa, Tsuyoshi Morimoto:
A multimodal-input multimedia-output guidance system: MMGS. - Oscar Vanegas, Akiji Tanaka, Keiichi Tokuda, Tadashi Kitamura:
HMM-based visual speech recognition using intensity and location normalization. - Yanjun Xu, Limin Du, Guoqiang Li, Ziqiang Hou:
A hierarchy probability-based visual features extraction method for speechreading. - Jörn Ostermann, Mark C. Beutnagel, Ariel Fischer, Yao Wang:
Integration of talking heads and text-to-speech synthesizers for visual TTS. - Levent M. Arslan, David Talkin:
Speech driven 3-d face point trajectory synthesis algorithm. - Eli Yamamoto, Satoshi Nakamura, Kiyohiro Shikano:
Speech-to-lip movement synthesis based on the EM algorithm using audio-visual HMMs. - Deb Roy, Alex Pentland:
Learning words from natural audio-visual input. - Stéphane Dupont, Juergen Luettin:
Using the multi-stream approach for continuous audio-visual speech recognition: experiments on the M2VTS database. - Sharon L. Oviatt, Karen Kuhn:
Referential features and linguistic indirection in multimodal language. - Michael Johnston:
Multimodal language processing. - Jun-ichi Hirasawa, Noboru Miyazaki, Mikio Nakano, Takeshi Kawabata:
Implementation of coordinative nodding behavior on spoken dialogue systems. - Masao Yokoyama, Kazumi Aoyama, Hideaki Kikuchi, Katsuhiko Shirai:
Use of non-verbal information in communication between human and robot. - Steve Whittaker, John Choi, Julia Hirschberg, Christine H. Nakatani:
What you see is (almost) what you hear: design principles for user interfaces for accessing speech archives.
Isolated Word Recognition
- Daniel Azzopardi, Shahram Semnani, Ben Milner, Richard Wiseman:
Improving accuracy of telephony-based, speaker-independent speech recognition. - Aruna Bayya:
Rejection in speech recognition systems with limited training. - Ruxin Chen, Miyuki Tanaka, Duanpei Wu, Lex Olorenshaw, Mariscela Amador:
A four layer sharing HMM system for very large vocabulary isolated word recognition. - Rathinavelu Chengalvarayan:
A comparative study of hybrid modelling techniques for improved telephone speech recognition. - Jae-Seung Choi, Jong-Seok Lee, Hee-Youn Lee:
Smoothing and tying for Korean flexible vocabulary isolated word recognition. - Javier Ferreiros, Javier Macías Guarasa, Ascensión Gallardo-Antolín, José Colás, Ricardo de Córdoba, José Manuel Pardo, Luis Villarrubia Grande:
Recent work on a preselection module for a flexible large vocabulary speech recognition system in telephone environment. - Masakatsu Hoshimi, Maki Yamada, Katsuyuki Niyada, Shozo Makino:
A study of noise robustness for speaker independent speech recognition method using phoneme similarity vector. - Fran H. L. Jian:
Classification of Taiwanese tones based on pitch and energy movements. - Finn Tore Johansen:
Phoneme-based recognition for the Norwegian SpeechDat(II) database. - Montri Karnjanadecha, Stephen A. Zahorian:
Robust feature extraction for alphabet recognition. - Hisashi Kawai, Norio Higuchi:
Recognition of connected digit speech in Japanese collected over the telephone network. - Takuya Koizumi, Shuji Taniguchi, Kazuhiro Kohtoh:
Improving the speaker-dependency of subword-unit-based isolated word recognition. - Tomohiro Konuma, Tetsu Suzuki, Maki Yamada, Yoshio Ono, Masakatsu Hoshimi, Katsuyuki Niyada:
Speaker independent speech recognition method using constrained time alignment near phoneme discriminative frame. - Ki Yong Lee, Joohun Lee:
A nonstationary autoregressive HMM with gain adaptation for speech recognition. - Ren-Yuan Lyu, Yuang-jin Chiang, Wen-ping Hsieh:
A large-vocabulary Taiwanese (Min-nan) multi-syllabic word recognition system based upon right-context-dependent phones with state clustering by acoustic decision tree. - Kazuyo Tanaka, Hiroaki Kojima:
Speech recognition based on the distance calculation between intermediate phonetic code sequences in symbolic domain. - York Chung-Ho Yang, June-Jei Kuo:
High accuracy Chinese speech recognition approach with Chinese input technology for telecommunication use.
Robust Speech Processing in Adverse Environments 1-5
- William J. J. Roberts, Yariv Ephraim:
Robust speech recognition using HMM's with Toeplitz state covariance matrices. - David P. Thambiratnam, Sridha Sridharan:
Modeling of output probability distribution to improve small vocabulary speech recognition in adverse environments. - Philippe Morin, Ted H. Applebaum, Robert Boman, Yi Zhao, Jean-Claude Junqua:
Robust and compact multilingual word recognizers using features extracted from a phoneme similarity front-end. - Takeshi Yamada, Satoshi Nakamura, Kiyohiro Shikano:
An effect of adaptive beamforming on hands-free speech recognition based on 3-D Viterbi search. - Joaquin Gonzalez-Rodriguez, Santiago Cruz-Llanas, Javier Ortega-Garcia:
Coherence-based subband decomposition for robust speech and speaker recognition in noisy and reverberant rooms. - Hui Jiang, Keikichi Hirose, Qiang Huo:
A minimax search algorithm for CDHMM based robust continuous speech recognition. - Su-Lin Wu, Brian Kingsbury, Nelson Morgan, Steven Greenberg:
Performance improvements through combining phone- and syllable-scale information in automatic speech recognition. - Arun C. Surendran, Chin-Hui Lee:
Predictive adaptation and compensation for robust speech recognition. - Jean-Claude Junqua, Steven Fincke, Kenneth L. Field:
Influence of the speaking style and the noise spectral tilt on the Lombard reflex and automatic speech recognition. - Stefano Crafa, Luciano Fissore, Claudio Vair:
Data-driven PMC and Bayesian learning integration for fast model adaptation in noisy conditions. - Martin Hunke, Meeran Hyun, Steve Love, Thomas Holton:
Improving the noise and spectral robustness of an isolated-word recognizer using an auditory-model front end. - Owen P. Kenny, Douglas J. Nelson:
A model for speech reverberation and intelligibility restoring filters. - Guojun Zhou, John H. L. Hansen, James F. Kaiser:
Linear and nonlinear speech feature analysis for stress classification. - Sahar E. Bou-Ghazale, John H. L. Hansen:
Speech feature modeling for robust stressed speech recognition. - Katrin Kirchhoff:
Combining articulatory and acoustic information for speech recognition in noisy and reverberant environments. - Tim Wark, Sridha Sridharan:
Improving speaker identification performance in reverberant conditions using lip information. - Masato Akagi, Mamoru Iwaki, Noriyoshi Sakaguchi:
Spectral sequence compensation based on continuity of spectral sequence. - Aruna Bayya, B. Yegnanarayana:
Robust features for speech recognition systems. - Frédéric Berthommier, Hervé Glotin, Emmanuel Tessier, Hervé Bourlard:
Interfacing of CASA and partial recognition based on a multistream technique. - Sen-Chia Chang, Shih-Chieh Chien, Chih-Chung Kuo:
An RNN-based compensation method for Mandarin telephone speech recognition. - Stephen M. Chu, Yunxin Zhao:
Robust speech recognition using discriminative stream weighting and parameter interpolation. - Johan de Veth, Bert Cranen, Lou Boves:
Acoustic backing-off in the local distance computation for robust automatic speech recognition. - Laura Docío Fernández, Carmen García-Mateo:
Noise model selection for robust speech recognition. - Simon Doclo, Ioannis Dologlou, Marc Moonen:
A novel iterative signal enhancement algorithm for noise reduction in speech. - Stéphane Dupont:
Missing data reconstruction for robust automatic speech recognition in the framework of hybrid HMM/ANN systems. - Ascensión Gallardo-Antolín, Fernando Díaz-de-María, Francisco J. Valverde-Albacete:
Recognition from GSM digital speech. - Petra Geutner, Matthias Denecke, Uwe Meier, Martin Westphal, Alex Waibel:
Conversational speech systems for on-board car navigation and assistance. - Laurent Girin, Laurent Varin, Gang Feng, Jean-Luc Schwartz:
A signal processing system for having the sound "pop-out" in noise thanks to the image of the speaker's lips: new advances using multi-layer perceptrons. - Ruhi Sarikaya, John H. L. Hansen:
Robust speech activity detection in the presence of noise. - Michel Héon, Hesham Tolba, Douglas D. O'Shaughnessy:
Robust automatic speech recognition by the application of a temporal-correlation-based recurrent multilayer neural network to the mel-based cepstral coefficients. - Juan M. Huerta, Richard M. Stern:
Speech recognition from GSM codec parameters. - Jeih-Weih Hung, Jia-Lin Shen, Lin-Shan Lee:
Improved parallel model combination based on better domain transformation for speech recognition under noisy environments. - Lamia Karray, Jean Monné:
Robust speech/non-speech detection in adverse conditions based on noise and speech statistics. - Myung Gyu Song, Hoi In Jung, Kab-Jong Shim, Hyung Soon Kim:
Speech recognition in car noise environments using multiple models according to noise masking levels. - Klaus Linhard, Tim Haulick:
Spectral noise subtraction with recursive gain curves. - Shengxi Pan, Jia Liu, Jintao Jiang, Zuoying Wang, Dajin Lu:
A novel robust speech recognition algorithm based on multi-models and integrated decision method. - Dusan Macho, Climent Nadeu:
On the interaction between time and frequency filtering of speech parameters for robust speech recognition. - Bhiksha Raj, Rita Singh, Richard M. Stern:
Inference of missing spectrographic features for robust speech recognition. - Volker Schless, Fritz Class:
SNR-dependent flooring and noise overestimation for joint application of spectral subtraction and model combination. - Jia-Lin Shen, Jeih-Weih Hung, Lin-Shan Lee:
Improved robust speech recognition considering signal correlation approximated by Taylor series. - Won-Ho Shin, Weon-Goo Kim, Chungyong Lee, Il-Whan Cha:
Speech recognition in noisy environment using weighted projection-based likelihood measure. - Tetsuya Takiguchi, Satoshi Nakamura, Kiyohiro Shikano, Masatoshi Morishima, Toshihiro Isobe:
Evaluation of model adaptation by HMM decomposition on telephone speech recognition. - Hesham Tolba, Douglas D. O'Shaughnessy:
Comparative experiments to evaluate a voiced-unvoiced-based pre-processing approach to robust automatic speech recognition in low-SNR environments. - Masashi Unoki, Masato Akagi:
Signal extraction from noisy signal based on auditory scene analysis. - Tsuyoshi Usagawa, Kenji Sakai, Masanao Ebata:
Frequency domain binaural model as the front end of speech recognition system. - An-Tzyh Yu, Hsiao-Chuan Wang:
A study on the recognition of low bit-rate encoded speech. - Tai-Hwei Hwang, Hsiao-Chuan Wang:
Weighted parallel model combination for noisy speech recognition. - Daniel Woo:
Favourable and unfavourable short duration segments of speech in noise. - Piero Cosi, Stefano Pasquin, Enrico Zovato:
Auditory modeling techniques for robust pitch extraction and noise reduction. - Eliathamby Ambikairajah, Graham Tattersall, Andrew Davis:
Wavelet transform-based speech enhancement. - Beth Logan, Tony Robinson:
A practical perceptual frequency autoregressive HMM enhancement system. - John H. L. Hansen, Bryan L. Pellom:
An effective quality evaluation protocol for speech enhancement algorithms. - Jin-Nam Park, Tsuyoshi Usagawa, Masanao Ebata:
An adaptive beamforming microphone array system using a blind deconvolution. - Latchman Singh, Sridha Sridharan:
Speech enhancement using critical band spectral subtraction.
Articulatory Modelling 1-2
- Pierre Badin, Gérard Bailly, Monica Raybaudi, Christoph Segebarth:
A three-dimensional linear articulatory model based on MRI data. - Pascal Perrier, Yohan Payan, Joseph S. Perkell, Frédéric Jolly, Majid Zandipour, Melanie Matthies:
On loops and articulatory biomechanics. - Didier Demolin, Véronique Lecuit, Thierry Metens, Bruno Nazarian, Alain Soquet:
Magnetic resonance measurements of the velum port opening. - Masafumi Matsumura, Takuya Niikawa, Takao Tanabe, Takashi Tachimura, Takeshi Wada:
Cantilever-type force-sensor-mounted palatal plate for measuring palatolingual contact stress and pattern during speech phonation. - Tokihiko Kaburagi, Masaaki Honda:
Determination of the vocal tract spectrum from the articulatory movements based on the search of an articulatory-acoustic database. - Kiyoshi Honda, Mark Tiede:
An MRI study on the relationship between oral cavity shape and larynx position. - Frantz Clermont, Parham Mokhtari:
Acoustic-articulatory evaluation of the upper vowel-formant region and its presumed speaker-specific potency. - Philip Hoole, Christian Kroos:
Control of larynx height in vowel production. - Paavo Alku, Juha Vintturi, Erkki Vilkman:
Analyzing the effect of secondary excitations of the vocal tract on vocal intensity in different loudness conditions. - Gordon Ramsay:
An analysis of modal coupling effects during the glottal cycle: formant synthesizers from time-domain finite-difference simulations. - John H. Esling:
Laryngoscopic analysis of pharyngeal articulations and larynx-height voice quality settings. - Hiroki Matsuzaki, Kunitoshi Motoki, Nobuhiro Miki:
Effects of shapes of radiational aperture on radiation characteristics. - Jonathan Harrington, Mary E. Beckman, Janet Fletcher, Sallyanne Palethorpe:
An electropalatographic, kinematic, and acoustic analysis of supralaryngeal correlates of word-level prominence contrasts in English. - Marija Tabain:
Consistencies and inconsistencies between EPG and locus equation data on coarticulation. - Gérard Bailly, Pierre Badin, Anne Vilain:
Synergy between jaw and lips/tongue movements : consequences in articulatory modelling. - Philip Hoole:
Modelling tongue configuration in German vowel production. - Alan Wrench, Alan D. McIntosh, Colin Watson, William J. Hardcastle:
Optopalatograph: real-time feedback of tongue movement in 3D. - Yohann Meynadier, Michel Pitermann, Alain Marchal:
Effects of contrastive focal accent on linguopalatal articulation and coarticulation in the French [kskl] cluster.
Talking to Infants, Pets and Lovers
- Christine Kitamura, Denis Burnham:
Acoustic and affective qualities of IDS in English. - Sudaporn Luksaneeyanawin, Chayada Thanavisuth, Suthasinee Sittigasorn, Onwadee Rukkarangsarit:
Pragmatic characteristics of infant directed speech. - Denis Burnham, Elizabeth Francis, Ute Vollmer-Conna, Christine Kitamura, Vicky Averkiou, Amanda Olley, Mary Nguyen, Cal Paterson:
Are you my little pussy-cat? Acoustic, phonetic and affective qualities of infant- and pet-directed speech. - Denis Burnham:
Special speech registers: talking to Australian and Thai infants, and to pets.
Speech Coding 1-3
- Takashi Masuko, Keiichi Tokuda, Takao Kobayashi:
A very low bit rate speech coder using HMM with speaker adaptation. - Erik Ekudden, Roar Hagen, Björn Johansson, Shinji Hayashi, Akitoshi Kataoka, Sachiko Kurihara:
ITU-T G.729 extension at 6.4 kbps. - Damith J. Mudugamuwa, Alan B. Bradley:
Adaptive transformation for segmented parametric speech coding. - Julien Epps, W. Harvey Holmes:
Speech enhancement using STC-based bandwidth extension. - Weihua Zhang, W. Harvey Holmes:
Performance and optimization of the SEEVOC algorithm. - Wendy J. Holmes:
Towards a unified model for low bit-rate speech coding using a recognition-synthesis approach. - Jan Skoglund, W. Bastiaan Kleijn:
On the significance of temporal masking in speech coding. - W. Bastiaan Kleijn, Huimin Yang, Ed F. Deprettere:
Waveform interpolation coding with pitch-spaced subbands. - Nicola R. Chong, Ian S. Burnett, Joe F. Chicharo:
An improved decomposition method for WI using IIR wavelet filter banks. - Paavo Alku, Susanna Varho:
A new linear predictive method for compression of speech signals. - Shahrokh Ghaemmaghami, Mohamed A. Deriche, Sridha Sridharan:
Hierarchical temporal decomposition: a novel approach to efficient compression of spectral characteristics of speech. - Susan L. Hura:
Speech intelligibility testing for new technologies. - Sung-Joo Kim, Sangho Lee, Woo-Jin Han, Yung-Hwan Oh:
Efficient quantization of LSF parameters based on temporal decomposition. - Minoru Kohata:
A sinusoidal harmonic vocoder at 1.2 kbps using auditory perceptual characteristics. - Kazuhito Koishida, Gou Hirabayashi, Keiichi Tokuda, Takao Kobayashi:
A 16 kbit/s wideband CELP coder using mel-generalized cepstral analysis and its subjective evaluation. - Derek J. Molyneux, C. I. Parris, Xiaoqin Sun, Barry M. G. Cheetham:
Comparison of spectral estimation techniques for low bit-rate speech coding. - Yoshihisa Nakatoh, Takeshi Norimatsu, Ah Heng Low, Hiroshi Matsumoto:
Low bit rate coding for speech and audio using mel linear predictive coding (MLPC) analysis. - Jeng-Shyang Pan, Chin-Shiuh Shieh, Shu-Chuan Chu:
Comparison study on VQ codevector index assignment. - John J. Parry, Ian S. Burnett, Joe F. Chicharo:
Using linguistic knowledge to improve the design of low-bit rate LSF quantisation. - Davor Petrinovic:
Transform coding of LSF parameters using wavelets. - Fabrice Plante, Barry M. G. Cheetham, David F. Marston, P. A. Barrett:
Source controlled variable bit-rate speech coder based on waveform interpolation. - Carlos M. Ribeiro, Isabel Trancoso:
Improving speaker recognisability in phonetic vocoders.
Neural Networks, Fuzzy and Evolutionary Methods 1
- Visarut Ahkuputra, Somchai Jitapunkul, Nutthacha Jittiwarangkul, Ekkarit Maneenoi, Sawit Kasuriya:
A comparison of Thai speech recognition systems using hidden Markov model, neural network, and fuzzy-neural network. - Felix Freitag, Enric Monte:
Phoneme recognition with statistical modeling of the prediction error of neural networks. - Toshiaki Fukada, Takayoshi Yoshimura, Yoshinori Sagisaka:
Neural network based pronunciation modeling with applications to speech recognition. - Stephen J. Haskey, Sekharajit Datta:
A comparative study of OCON and MLP architectures for phoneme recognition. - John-Paul Hosom, Ronald A. Cole, Piero Cosi:
Evaluation and integration of neural-network training techniques for continuous digit recognition. - Ying Jia, Limin Du, Ziqiang Hou:
Hierarchical neural networks (HNN) for Chinese continuous speech recognition. - Eric Keller:
Neural network motivation for segmental distribution. - Nikki Mirghafori, Nelson Morgan:
Combining connectionist multi-band and full-band probability streams for speech recognition of natural numbers. - Ednaldo Brigante Pizzolato, T. Jeff Reynolds:
Initial speech recognition results using the multinet architecture. - Tomio Takara, Yasushi Iha, Itaru Nagayama:
Selection of the optimal structure of the continuous HMM using the genetic algorithm. - Dat Tran, Michael Wagner, Tu Van Le:
A proposed decision rule for speaker recognition based on fuzzy c-means clustering. - Dat Tran, Tu Van Le, Michael Wagner:
Fuzzy Gaussian mixture models for speaker recognition. - Chai Wutiwiwatchai, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin:
A new strategy of fuzzy-neural network for Thai numeral speech recognition. - Chai Wutiwiwatchai, Somchai Jitapunkul, Visarut Ahkuputra, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin:
Thai polysyllabic word recognition using fuzzy-neural network. - Axel Glaeser:
Modular neural networks for low-complex phoneme recognition. - João F. G. de Freitas, Sue E. Johnson, Mahesan Niranjan, Andrew H. Gee:
Global optimisation of neural network models via sequential sampling-importance resampling. - Jörg Rottland, Andre Ludecke, Gerhard Rigoll:
Efficient computation of MMI neural networks for large vocabulary speech recognition systems. - Sid-Ahmed Selouani, Jean Caelen:
Modular connectionist systems for identifying complex arabic phonetic features. - Tuan D. Pham, Michael Wagner:
Fuzzy-integration based normalization for speaker verification. - Hiroshi Shimodaira, Jun Rokui, Mitsuru Nakai:
Improving the generalization performance of the MCE/GPD learning. - Tetsuro Kitazoe, Tomoyuki Ichiki, Sung-Ill Kim:
Acoustic speech recognition model by neural net equation with competition and cooperation. - Julie Ngan, Aravind Ganapathiraju, Joseph Picone:
Improved surname pronunciations using decision trees.
Utterance Verification and Word Spotting 1 / Speaker Adaptation 1
- M. Carmen Benítez, Antonio J. Rubio, Pedro García-Teodoro, Jesús Esteban Díaz Verdejo:
Word verification using confidence measures in speech recognition. - Giulia Bernardis, Hervé Bourlard:
Improving posterior based confidence measures in hybrid HMM/ANN speech recognition systems. - Javier Caminero, Eduardo López, Luis A. Hernández Gómez:
Two-pass utterance verification algorithm for long natural numbers recognition. - Berlin Chen, Hsin-Min Wang, Lee-Feng Chien, Lin-Shan Lee:
A*-admissible key-phrase spotting with sub-syllable level utterance verification. - Volker Fischer, Yuqing Gao, Eric Janke:
Speaker-independent upfront dialect adaptation in a large vocabulary continuous speech recognizer. - Asela Gunawardana, Hsiao-Wuen Hon, Li Jiang:
Word-based acoustic confidence measures for large-vocabulary speech recognition. - Sunil K. Gupta, Frank K. Soong:
Improved utterance rejection using length dependent thresholds. - Ching-Hsiang Ho, Saeed Vaseghi, Aimin Chen:
Bayesian constrained frequency warping HMMs for speaker normalisation. - Masaki Ida, Ryuji Yamasaki:
An evaluation of keyword spotting performance utilizing false alarm rejection based on prosodic information. - Dieu Tran, Ken-ichi Iso:
Predictive speaker adaptation and its prior training. - Rachida El Méliani, Douglas D. O'Shaughnessy:
Powerful syllabic fillers for general-task keyword-spotting and unlimited-vocabulary continuous-speech recognition. - Christine Pao, Philipp Schmid, James R. Glass:
Confidence scoring for speech understanding systems. - Bhuvana Ramabhadran, Abraham Ittycheriah:
Phonological rules for enhancing acoustic enrollment of unknown words. - Anand R. Setlur, Rafid A. Sukkar:
Recognition-based word counting for reliable barge-in and early endpoint detection in continuous speech recognition. - Martin Westphal, Tanja Schultz, Alex Waibel:
Linear discriminant - a new criterion for speaker normalization. - Gethin Williams, Steve Renals:
Confidence measures derived from an acceptor HMM. - Chung-Hsien Wu, Yeou-Jiunn Chen, Yu-Chun Hung:
Telephone speech multi-keyword spotting using fuzzy search algorithm and prosodic verification. - Yoichi Yamashita, Toshikatsu Tsunekawa, Riichiro Mizoguchi:
Topic recognition for news speech based on keyword spotting.
Human Speech Perception 1-4
- Sieb G. Nooteboom, Meinou van Dijk:
Heads and tails in word perception: evidence for 'early-to-late' processing in listening and reading. - Saskia te Riele, Hugo Quené:
Evidence for early effects of sentence context on word segmentation. - Hugo Quené, Maya van Rossum, Mieke van Wijck:
Assimilation and anticipation in word perception. - M. Louise Kelly, Ellen Gurman Bard, Catherine Sotillo:
Lexical activation by assimilated and reduced tokens. - Masato Akagi, Mamoru Iwaki, Tomoya Minakawa:
Fundamental frequency fluctuation in continuous vowel utterance and its perception. - Shigeaki Amano, Tadahisa Kondo:
Estimation of mental lexicon size with word familiarity database. - Matthew P. Aylett, Alice Turk:
Vowel quality in spontaneous speech: what makes a good vowel? - Adrian Neagu, Gérard Bailly:
Cooperation and competition of burst and formant transitions for the perception and identification of French stops. - Anne Bonneau, Yves Laprie:
The effect of modifying formant amplitudes on the perception of French vowels generated by copy synthesis. - Hsuan-Chih Chen, Michael C. W. Yip, Sum-Yin Wong:
Segmental and tonal processing in Cantonese. - Michael C. W. Yip, Po-Yee Leung, Hsuan-Chih Chen:
Phonological similarity effects in Cantonese spoken-word processing. - Robert I. Damper, Steve R. Gunn:
On the learnability of the voicing contrast for initial stops. - Loredana Cerrato, Mauro Falcone:
Acoustic and perceptual characteristics of Italian stop consonants. - Santiago Fernández, Sergio Feijóo, Ramón Balsa, Nieves Barros:
Acoustic cues for the auditory identification of the Spanish fricative /f/. - Santiago Fernández, Sergio Feijóo, Ramón Balsa, Nieves Barros:
Recognition of vowels in fricative context. - Santiago Fernández, Sergio Feijóo, Plinio Almeida:
Voicing affects perceived manner of articulation. - Valérie Hazan, Andrew Simpson, Mark A. Huckvale:
Enhancement techniques to improve the intelligibility of consonants in noise : speaker and listener effects. - Fran H. L. Jian:
Boundaries of perception of long tones in Taiwanese speech. - Hiroaki Kato, Minoru Tsuzaki, Yoshinori Sagisaka:
Effects of phonetic quality and duration on perceptual acceptability of temporal changes in speech. - Michael Kiefte, Terrance M. Nearey:
Dynamic vs. static spectral detail in the perception of gated stops. - Takashi Otake, Kiyoko Yoneyama:
Phonological units in speech segmentation and phonological awareness. - Elizabeth Shriberg, Andreas Stolcke:
How far do speakers back up in repairs? A quantitative model. - Karsten Steinhauer, Kai Alter, Angela D. Friederici:
Don't blame it (all) on the pause: further ERP evidence for a prosody-induced garden-path in running speech. - Jean Vroomen, Béatrice de Gelder:
The role of stress for lexical selection in Dutch. - Jyrki Tuomainen, Jean Vroomen, Béatrice de Gelder:
The perception of stressed syllables in Finnish. - Kimiko Yamakawa, Ryoji Baba:
The perception of the morae with devocalized vowels in Japanese language. - Dominic W. Massaro:
Categorical perception: important phenomenon or lasting myth? - Ellen Gerrits, Bert Schouten:
Categorical perception of vowels. - Kazuhiko Kakehi, Yuki Hirose:
Suprasegmental cues for the segmentation of identical vowel sequences in Japanese. - William A. Ainsworth:
Perception of concurrent approximant-vowel syllables. - Dawn M. Behne, Peter E. Czigler, Kirk P. H. Sullivan:
Perceived Swedish vowel quantity: effects of postvocalic consonant duration. - Anne Cutler, Rebecca Treiman, Brit van Ooijen:
Orthografik inkoncistensy ephekts in foneme detektion? - Bruce L. Derwing, Terrance M. Nearey, Yeo Bom Yoon:
The effect of orthographic knowledge on the segmentation of speech. - James M. McQueen, Anne Cutler:
Spotting (different types of) words in (different types of) context. - Manjari Ohala, John J. Ohala:
Correlation between consonantal VC transitions and degree of perceptual confusion of place contrast in Hindi. - David House, Dik J. Hermes, Frédéric Beaugendre:
Perception of tonal rises and falls for accentuation and phrasing in Swedish. - Steven Greenberg, Takayuki Arai, Rosaria Silipo:
Speech intelligibility derived from exceedingly sparse spectral information.
Speech and Hearing Disorders 1
- Mark C. Flynn, Richard C. Dowell, Graeme M. Clark:
Adults with a severe-to-profound hearing impairment: investigating the effects of linguistic context on speech perception. - Florien J. Koopmans-van Beinum, Caroline E. Schwippert, Cecile T. L. Kuijpers:
Speech perception in dyslexia: measurements from birth onwards. - Karen Croot:
An acoustic analysis of vowel production across tasks in a case of non-fluent progressive aphasia. - Jan van Doorn, Sharynne McLeod, Elise Baker, Alison Purcell, William Thorpe:
Speech technology in clinical environments.
Spoken Language Understanding Systems 1-4
- Stephanie Seneff, Edward Hurley, Raymond Lau, Christine Pao, Philipp Schmid, Victor Zue:
GALAXY-II: a reference architecture for conversational system development. - Grace Chung, Stephanie Seneff:
Improvements in speech understanding accuracy through the integration of hierarchical linguistic, prosodic, and phonological constraints in the JUPITER domain. - Kenney Ng:
Towards robust methods for spoken document retrieval. - Richard Sproat, Jan P. H. van Santen:
Automatic ambiguity detection. - Julia Fischer, Jürgen Haas, Elmar Nöth, Heinrich Niemann, Frank Deinzer:
Empowering knowledge based speech understanding through statistics. - Akito Nagai, Yasushi Ishikawa:
Concept-driven speech understanding incorporated with a statistic language model. - José Colás, Javier Ferreiros, Juan Manuel Montero, Julio Pastor, Ascensión Gallardo-Antolín, José Manuel Pardo:
On the limitations of stochastic conceptual finite-state language models for speech understanding. - Todd Ward, Salim Roukos, Chalapathy Neti, Jerome Gros, Mark Epstein, Satya Dharanipragada:
Towards speech understanding across multiple languages. - Andreas Stolcke, Elizabeth Shriberg, Rebecca A. Bates, Mari Ostendorf, Dilek Zeynep Hakkani, Madelaine Plauché, Gökhan Tür, Yu Lu:
Automatic detection of sentence boundaries and disfluencies based on recognized words. - Wolfgang Reichl, Bob Carpenter, Jennifer Chu-Carroll, Wu Chou:
Language modeling for content extraction in human-computer dialogues. - John Gillett, Wayne H. Ward:
A language model combining trigrams and stochastic context-free grammars. - Bernd Souvignier, Andreas Kellner:
Online adaptation of language models in spoken dialogue systems. - Giuseppe Riccardi, Alexandros Potamianos, Shrikanth S. Narayanan:
Language model adaptation for spoken language systems. - Brigitte Bigi, Renato de Mori, Marc El-Bèze, Thierry Spriet:
Detecting topic shifts using a cache memory. - Lori S. Levin, Ann E. Thymé-Gobbel, Alon Lavie, Klaus Ries, Klaus Zechner:
A discourse coding scheme for conversational Spanish. - Kazuhiro Arai, Jeremy H. Wright, Giuseppe Riccardi, Allen L. Gorin:
Grammar fragment acquisition using syntactic and semantic clustering. - Tom Brøndsted:
Non-expert access to unification based speech understanding. - Bob Carpenter, Jennifer Chu-Carroll:
Natural language call routing: a robust, self-organizing approach. - Debajit Ghosh, David Goddeau:
Automatic grammar induction from semantic parsing. - Yasuyuki Kono, Takehide Yano, Munehiko Sasajima:
BTH: an efficient parsing algorithm for word-spotting. - Susanne Kronenberg, Franz Kummert:
Syntax coordination: interaction of discourse and extrapositions. - Bor-Shen Lin, Berlin Chen, Hsin-Min Wang, Lin-Shan Lee:
Hierarchical tag-graph search for spontaneous speech understanding in spoken dialog systems. - Yasuhisa Niimi, Noboru Takinaga, Takuya Nishimoto:
Extraction of the dialog act and the topic from utterances in a spoken dialog system. - Harry Printz:
Fast computation of maximum entropy / minimum divergence feature gain. - Giuseppe Riccardi, Allen L. Gorin:
Stochastic language models for speech recognition and understanding. - Carol Van Ess-Dykema, Klaus Ries:
Linguistically engineered tools for speech recognition error analysis. - Kazuya Takeda, Atsunori Ogawa, Fumitada Itakura:
Estimating entropy of a language from optimal word insertion penalty. - Shu-Chuan Tseng:
A linguistic analysis of repair signals in co-operative spoken dialogues. - Francisco J. Valverde-Albacete, José Manuel Pardo:
A hierarchical language model for CSR. - Jeremy H. Wright, Allen L. Gorin, Alicia Abella:
Spoken language understanding within dialogs using a graphical model of task structure. - Yoshimi Suzuki, Fumiyo Fukumoto, Yoshihiro Sekiguchi:
Keyword extraction of radio news using domain identification based on categories of an encyclopedia.
Signal Processing and Speech Analysis 1-3
- James Droppo, Alex Acero:
Maximum a posteriori pitch tracking. - Dekun Yang, Georg F. Meyer, William A. Ainsworth:
Vowel separation using the reassigned amplitude-modulation spectrum. - Eloi Batlle, Climent Nadeu, José A. R. Fonollosa:
Feature decorrelation methods in speech recognition: a comparative study. - Marie-José Caraty, Claude Montacié:
Multi-resolution for speech analysis. - Steve Cassidy, Catherine I. Watson:
Dynamic features in children's vowels. - Johan de Veth, Lou Boves:
Effectiveness of phase-corrected RASTA for continuous speech recognition. - Satya Dharanipragada, Ramesh A. Gopinath, Bhaskar D. Rao:
Techniques for capturing temporal variations in speech signals with fixed-rate processing. - Limin Du, Kenneth N. Stevens:
Automatic detection of landmark for nasal consonants from speech waveform. - Thierry Dutoit, Juergen Schroeter:
Plug and play software for designing high-level speech processing systems. - Alexandre Girardi, Kiyohiro Shikano, Satoshi Nakamura:
Creating speaker independent HMM models for restricted database using STRAIGHT-TEMPO morphing. - Laure Charonnat, Michel Guitton, Joel Crestel, Gerome Allée:
Restoration of hyperbaric speech by correction of the formants and the pitch. - Juana M. Gutiérrez-Arriola, Yung-Sheng Hsiao, Juan Manuel Montero, José Manuel Pardo, Donald G. Childers:
Voice conversion based on parameter transformation. - Jilei Tian, Ramalingam Hariharan, Kari Laurila:
Noise robust two-stream auditory feature extraction method for speech recognition. - Andrew K. Halberstadt, James R. Glass:
Heterogeneous measurements and multiple classifiers for speech recognition. - Naomi Harte, Saeed Vaseghi, Ben P. Milner:
Joint recognition and segmentation using phonetically derived features and a hybrid phoneme model. - Hynek Hermansky, Sangita Sharma:
TRAPS - classifiers of temporal patterns. - John N. Holmes:
Robust measurement of fundamental frequency and degree of voicing. - John F. Holzrichter, Gregory C. Burnett, Todd J. Gable, Lawrence C. Ng:
Micropower electro-magnetic sensors for speech characterization, recognition, verification, and other applications. - Jia-Lin Shen, Jeih-Weih Hung, Lin-Shan Lee:
Robust entropy-based endpoint detection for speech recognition in noisy environments. - Jia-Lin Shen, Wen-Liang Hwang:
Statistical integration of temporal filter banks for robust speech recognition using linear discriminant analysis (LDA). - Dorota J. Iskra, William H. Edmondson:
Feature-based approach to speech recognition. - Hiroyuki Kamata, Akira Kaneko, Yoshihisa Ishida:
Periodicity emphasis of voice wave using nonlinear IIR digital filters and its applications. - Simon King, Todd A. Stephenson, Stephen Isard, Paul Taylor, Alex Strachan:
Speech recognition via phonetically featured syllables. - Jacques C. Koreman, Bistra Andreeva, William J. Barry:
Do phonetic features help to improve consonant identification in ASR? - Hisao Kuwabara:
Perceptual and acoustic properties of phonemes in continuous speech for different speaking rate. - Joohun Lee, Ki Yong Lee:
On robust sequential estimator based on t-distribution with forgetting factor for speech analysis. - Christopher John Long, Sekharajit Datta:
Discriminant wavelet basis construction for speech recognition. - Hiroshi Matsumoto, Yoshihisa Nakatoh, Yoshinori Furuhata:
An efficient mel-LPC analysis method for speech recognition. - Philip McMahon, Paul M. McCourt, Saeed Vaseghi:
Discriminative weighting of multi-resolution sub-band cepstral features for speech recognition. - Yoram Meron, Keikichi Hirose:
Separation of singing and piano sounds. - Nobuaki Minematsu, Seiichi Nakagawa:
Modeling of variations in cepstral coefficients caused by F0 changes and its application to speech processing. - Partha Niyogi, Partha Mitra, Man Mohan Sondhi:
A detection framework for locating phonetic events. - Climent Nadeu, Félix Galindo, Jaume Padrell:
On frequency averaging for spectral analysis in speech recognition. - Munehiro Namba, Yoshihisa Ishida:
Wavelet transform domain blind equalization and its application to speech analysis. - Steve Pearson:
A novel method of formant analysis and glottal inverse filtering. - Antonio J. Araujo, Vitor C. Pera, Márcio N. de Souza:
Vector quantizer acceleration for an automatic speech recognition application. - Hartmut R. Pfitzinger:
Local speech rate as a combination of syllable and phone rate. - Solange Rossato, Gang Feng, Rafael Laboissière:
Recovering gestures from speech signals: a preliminary study for nasal vowels. - Günther Ruske, Robert Faltlhauser, Thilo Pfau:
Extended linear discriminant analysis (ELDA) for speech recognition. - Ara Samouelian, Jordi Robert-Ribes, Mike Plumpe:
Speech, silence, music and noise classification of TV broadcast material. - Jean Schoentgen, Alain Soquet, Véronique Lecuit, Sorin Ciocea:
The relation between vocal tract shape and formant frequencies can be described by means of a system of coupled differential equations. - Youngjoo Suh, Kyuwoong Hwang, Oh-Wook Kwon, Jun Park:
Improving speech recognizer by broader acoustic-phonetic group classification. - C. William Thorpe:
Separation of speech source and filter by time-domain deconvolution. - Hesham Tolba, Douglas D. O'Shaughnessy:
On the application of the AM-FM model for the recovery of missing frequency bands of telephone speech. - Chang-Sheng Yang, Hideki Kasuya:
Estimation of voice source and vocal tract parameters using combined subspace-based and amplitude spectrum-based algorithm. - Fang Zheng, Zhanjiang Song, Ling Li, Wenjian Yu, Fengzhou Zheng, Wenhu Wu:
The distance measure for line spectrum pairs applied to speech recognition. - William A. Ainsworth, Charles Robert Day, Georg F. Meyer:
Improving pitch estimation with short duration speech samples. - Hideki Kawahara, Alain de Cheveigné, Roy D. Patterson:
An instantaneous-frequency-based pitch extraction method for high-quality speech transformation: revised TEMPO in the STRAIGHT-suite. - Kiyoaki Aikawa:
Speaker-independent speech recognition using micro segment spectrum integration. - Keiichi Funaki, Yoshikazu Miyanaga, Koji Tochinai:
On robust speech analysis based on time-varying complex AR model. - Hynek Hermansky, Narendranath Malayath:
Spectral basis functions from discriminant analysis. - Shin Suzuki, Takeshi Okadome, Masaaki Honda:
Determination of articulatory positions from speech acoustics by applying dynamic articulatory constraints. - Yang Li, Yunxin Zhao:
Recognizing emotions in speech using short-term and long-term features. - Arnaud Robert, Jan Eriksson:
Periphear: a nonlinear active model of the auditory periphery. - Padma Ramesh, Partha Niyogi:
The voicing feature for stop consonants: acoustic phonetic analyses and automatic speech recognition experiments. - Sankar Basu, Stéphane H. Maes:
Wavelet-based energy binning cepstral features for automatic speech recognition. - Carlos Silva, Samir Chennoukh:
Articulatory analysis using a codebook for articulatory based low bit-rate speech coding.
Spoken Language Generation and Translation 1-2
- Fang Chen, Baozong Yuan:
The modeling and realization of natural speech generation system. - Robert Eklund:
"ko tok ples ensin bilong tok pisin" or the TP-CLE: a first report from a pilot speech-to-speech translation project from Swedish to tok pisin. - Ismael García-Varea, Francisco Casacuberta, Hermann Ney:
An iterative, DP-based search algorithm for statistical machine translation. - Barbara Gawronska, David House:
Information extraction and text generation of news reports for a Swedish-English bilingual spoken dialogue system. - Joris Hulstijn, Arjan van Hessen:
Utterance generation for transaction dialogues. - Kai Ishikawa, Eiichiro Sumita, Hitoshi Iida:
Example-based error recovery method for speech translation: repairing sub-trees according to the semantic distance. - Emiel Krahmer, Mariët Theune:
Context sensitive generation of descriptions. - Lori S. Levin, Donna Gates, Alon Lavie, Alex Waibel:
An interlingua based on domain actions for machine translation of task-oriented dialogues. - Sandra Williams:
Generating pitch accents in a concept-to-speech system using a knowledge base. - Tobias Ruland, C. J. Rupp, Jörg Spilker, Hans Weber, Karsten L. Worm:
Making the most of multiplicity: a multi-parser multi-strategy architecture for the robust processing of spoken language. - Jon R. W. Yi, James R. Glass:
Natural-sounding speech synthesis using variable-length units. - Esther Klabbers, Emiel Krahmer, Mariët Theune:
A generic algorithm for generating spoken monologues. - Janet Hitzeman, Alan W. Black, Paul Taylor, Chris Mellish, Jon Oberlander:
On the use of automatically generated discourse-level information in a concept-to-speech synthesis system. - Hiyan Alshawi, Srinivas Bangalore, Shona Douglas:
Learning phrase-based head transduction models for translation of spoken utterances. - Toshiaki Fukada, Detlef Koll, Alex Waibel, Kouichi Tanigaki:
Probabilistic dialogue act extraction for concept based multilingual translation systems. - Ye-Yi Wang, Alex Waibel:
Fast decoding for statistical machine translation. - Toshiyuki Takezawa, Tsuyoshi Morimoto, Yoshinori Sagisaka, Nick Campbell, Hitoshi Iida, Fumiaki Sugaya, Akio Yokoo, Seiichi Yamamoto:
A Japanese-to-English speech translation system: ATR-MATRIX.
Segmentation, Labelling and Speech Corpora 1-4
- Julia Hirschberg, Christine H. Nakatani:
Acoustic indicators of topic segmentation. - Esther Grabe, Francis Nolan, Kimberley J. Farrar:
IViE - a comparative transcription system for intonational variation in English. - Fu-Chiang Chou, Chiu-yu Tseng, Lin-Shan Lee:
Automatic segmental and prosodic labeling of Mandarin speech database. - Stefan Rapp:
Automatic labelling of German prosody. - Matti Karjalainen, Toomas Altosaar, Miikka Huttunen:
An efficient labeling tool for the Quicksig speech database. - Harry Bratt, Leonardo Neumeyer, Elizabeth Shriberg, Horacio Franco:
Collection and detailed transcription of a speech database for development of language learning technologies. - Neeraj Deshmukh, Aravind Ganapathiraju, Andi Gleeson, Jonathan Hamaker, Joseph Picone:
Resegmentation of SWITCHBOARD. - Demetrio Aiello, Cristina Delogu, Renato de Mori, Andrea Di Carlo, Marina Nisi, Silvia Tummeacciu:
Automatic generation of visual scenarios for spoken corpora acquisition. - Mauro Cettolo, Daniele Falavigna:
Automatic detection of semantic boundaries based on acoustic and lexical knowledge. - Iman Gholampour, Kambiz Nayebi:
A new fast algorithm for automatic segmentation of continuous speech. - Akemi Iida, Nick Campbell, Soichiro Iga, Fumito Higuchi, Michiaki Yasumura:
Acoustic nature and perceptual testing of corpora of emotional speech. - Pyungsu Kang, Jiyoung Kang, Jinyoung Kim:
Korean prosodic break index labelling by a new mixed method of LDA and VQ. - Mark R. Laws, Richard Kilgour:
MOOSE: management of Otago speech environment. - Fabrice Malfrère, Olivier Deroo, Thierry Dutoit:
Phonetic alignment: speech synthesis based vs. hybrid HMM/ANN. - J. Bruce Millar:
Customisation and quality assessment of spoken language description. - Claude Montacié, Marie-José Caraty:
A silence/noise/music/speech splitting algorithm. - David Pye, Nicholas J. Hollinghurst, Timothy J. Mills, Kenneth R. Wood:
Audio-visual segmentation for content-based retrieval. - Stefan Rapp, Grzegorz Dogil:
Same news is good news: automatically collecting reoccurring radio news stories. - Christel Brindöpke, Brigitte Schaffranietz:
An annotation system for melodic aspects of German spontaneous speech. - Karlheinz Stöber, Wolfgang Hess:
Additional use of phoneme duration hypotheses in automatic speech segmentation. - Amy Isard, David McKelvie, Henry S. Thompson:
Towards a minimal standard for dialogue transcripts: a new SGML architecture for the HCRC map task corpus. - Pedro J. Moreno, Christopher F. Joerg, Jean-Manuel Van Thong, Oren Glickman:
A recursive algorithm for the forced alignment of very long audio segments. - Judith M. Kessens, Mirjam Wester, Catia Cucchiarini, Helmer Strik:
The selection of pronunciation variants: comparing the performance of man and machine. - Jon Barker, Gethin Williams, Steve Renals:
Acoustic confidence measures for segmenting broadcast news. - Bryan L. Pellom, John H. L. Hansen:
A duration-based confidence measure for automatic segmentation of noise corrupted speech. - Thomas Hain, Philip C. Woodland:
Segmentation and classification of broadcast news audio. - Børge Lindberg, Robrecht Comeyne, Christoph Draxler, Francesco Senia:
Speaker recruitment methods and speaker coverage - experiences from a large multilingual speech database collection. - Estelle Campione, Jean Véronis:
A multilingual prosodic database. - Ronald A. Cole, Mike Noel, Victoria Noel:
The CSLU speaker recognition corpus. - Gregory Aist, Peggy Chan, Xuedong Huang, Li Jiang, Rebecca Kennedy, DeWitt Latimer IV, Jack Mostow, Calvin Yeung:
How effective is unsupervised data collection for children's speech recognition? - Jyh-Shing Shyuu, Jhing-Fa Wang:
An algorithm for automatic generation of Mandarin phonetic balanced corpus. - Steven Bird, Mark Y. Liberman:
Towards a formal framework for linguistic annotations. - Toomas Altosaar, Martti Vainio:
Forming generic models of speech for uniform database access.
Large Vocabulary Continuous Speech Recognition 1-6
- Gary D. Cook, Tony Robinson, James Christie:
Real-time recognition of broadcast news. - Ha-Jin Yu, Hoon Kim, Jae-Seung Choi, Joon-Mo Hong, Kew-Suh Park, Jong-Seok Lee, Hee-Youn Lee:
Automatic recognition of Korean broadcast news speech. - James R. Glass, Timothy J. Hazen:
Telephone-based conversational speech recognition in the JUPITER domain. - Hsiao-Wuen Hon, Yun-Cheng Ju, Keiko Otani:
Japanese large-vocabulary continuous speech recognition system based on Microsoft Whisper. - Jean-Luc Gauvain, Lori Lamel, Gilles Adda:
Partitioning and transcription of broadcast news data. - Hajime Tsukada, Hirofumi Yamamoto, Toshiyuki Takezawa, Yoshinori Sagisaka:
Grammatical word graph re-generation for spontaneous speech recognition. - Norimichi Yodo, Kiyohiro Shikano, Satoshi Nakamura:
Compression algorithm of trigram language models based on maximum likelihood estimation. - Ulla Uebler, Heinrich Niemann:
Morphological modeling of word classes for language models. - Imed Zitouni, Kamel Smaïli, Jean Paul Haton, Sabine Deligne, Frédéric Bimbot:
A comparative study between polyclass and multiclass language models. - Dietrich Klakow:
Log-linear interpolation of language models. - Philip Clarkson, Tony Robinson:
The applicability of adaptive language modelling for the broadcast news task. - Long Nguyen, Richard M. Schwartz:
The BBN single-phonetic-tree fast-match algorithm. - Akinobu Lee, Tatsuya Kawahara, Shuji Doshita:
An efficient two-pass search algorithm using word trellis index. - Mike Schuster:
Nozomi - a fast, memory-efficient stack decoder for LVCSR. - Thomas Kemp, Alex Waibel:
Reducing the OOV rate in broadcast news speech recognition. - Michiel Bacchiani, Mari Ostendorf:
Using automatically-derived acoustic sub-word units in large vocabulary speech recognition. - Don McAllaster, Lawrence Gillick, Francesco Scattone, Michael Newman:
Fabricating conversational speech data with acoustic models: a program to examine model-data mismatch. - Wu Chou, Wolfgang Reichl:
High resolution decision tree based acoustic modeling beyond CART. - Thomas Kemp, Alex Waibel:
Unsupervised training of a speech recognizer using TV broadcasts. - Clark Z. Lee, Douglas D. O'Shaughnessy:
A new method to achieve fast acoustic matching for speech recognition. - Jacques Duchateau, Kris Demuynck, Dirk Van Compernolle, Patrick Wambacq:
Improved parameter tying for efficient acoustic model evaluation in large vocabulary continuous speech recognition. - Ananth Sankar:
A new look at HMM parameter tying for large vocabulary speech recognition. - Ramesh A. Gopinath, Bhuvana Ramabhadran, Satya Dharanipragada:
Factor analysis invariant to linear transformations of data. - Akio Ando, Akio Kobayashi, Toru Imai:
A thesaurus-based statistical language model for broadcast news transcription. - Sreeram V. Balakrishnan:
Effect of task complexity on search strategies for the Motorola Lexicus continuous speech recognition system. - Dhananjay Bansal, Mosur K. Ravishankar:
New features for confidence annotation. - Jerome R. Bellegarda:
Multi-Span statistical language modeling for large vocabulary speech recognition. - Rathinavelu Chengalvarayan:
Maximum-likelihood updates of HMM duration parameters for discriminative continuous speech recognition. - Noah Coccaro, Daniel Jurafsky:
Towards better integration of semantic predictors in statistical language modeling. - Julio Pastor, José Colás, Rubén San Segundo, José Manuel Pardo:
An asymmetric stochastic language model based on multi-tagged words. - Vassilios Digalakis, Leonardo Neumeyer, Manolis Perakakis:
Product-code vector quantization of cepstral parameters for speech recognition over the WWW. - Bernard Doherty, Saeed Vaseghi, Paul M. McCourt:
Context dependent tree based transforms for phonetic speech recognition. - Michael T. Johnson, Mary P. Harper, Leah H. Jamieson:
Interfacing acoustic models with natural language processing systems. - Photina Jaeyun Jang, Alexander G. Hauptmann:
Hierarchical cluster language modeling with statistical rule extraction for rescoring n-best hypotheses during speech decoding. - Atsuhiko Kai, Yoshifumi Hirose, Seiichi Nakagawa:
Dealing with out-of-vocabulary words and speech disfluencies in an n-gram based speech understanding system. - Tetsunori Kobayashi, Yosuke Wada, Norihiko Kobayashi:
Source-extended language model for large vocabulary continuous speech recognition. - Akio Kobayashi, Kazuo Onoe, Toru Imai, Akio Ando:
Time dependent language model for broadcast news transcription and its post-correction. - Jacques C. Koreman, William J. Barry, Bistra Andreeva:
Exploiting transitions and focussing on linguistic properties for ASR. - Raymond Lau, Stephanie Seneff:
A unified framework for sublexical and linguistic modelling supporting flexible vocabulary speech understanding. - Lalit R. Bahl, Steven V. De Gennaro, Pieter de Souza, Edward A. Epstein, J. M. Le Roux, Burn L. Lewis, Claire Waast:
A method for modeling liaison in a speech recognition system for French. - Fu-Hua Liu, Michael Picheny:
On variable sampling frequencies in speech recognition. - Kristine W. Ma, George Zavaliagkos, Rukmini Iyer:
Pronunciation modeling for large vocabulary conversational speech recognition. - Sankar Basu, Abraham Ittycheriah, Stéphane H. Maes:
Time shift invariant speech recognition. - José B. Mariño, Pau Pachès-Leal, Albino Nogueiras:
The demiphone versus the triphone in a decision-tree state-tying framework. - Shinsuke Mori, Masafumi Nishimura, Nobuyasu Itoh:
Word clustering for a word bi-gram model. - João Paulo Neto, Ciro Martins, Luís B. Almeida:
A large vocabulary continuous speech recognition hybrid system for the portuguese language. - Mukund Padmanabhan, Bhuvana Ramabhadran, Sankar Basu:
Speech recognition performance on a new voicemail transcription task. - Sira E. Palazuelos, Santiago Aguilera, José Rodrigo, Juan Ignacio Godino-Llorente:
Grammatical and statistical word prediction system for Spanish integrated in an aid for people with disabilities. - Kishore Papineni, Satya Dharanipragada:
Segmentation using a maximum entropy approach. - Adam L. Berger, Harry Printz:
Recognition performance of a large-scale dependency grammar language model. - Ganesh N. Ramaswamy, Harry Printz, Ponani S. Gopalakrishnan:
A bootstrap technique for building domain-dependent language models. - Joan-Andreu Sánchez, José-Miguel Benedí:
Estimation of the probability distributions of stochastic context-free grammars from the k-best derivations. - Ananth Sankar:
Robust HMM estimation with Gaussian merging-splitting and tied-transform HMMs. - Kristie Seymore, Stanley F. Chen, Ronald Rosenfeld:
Nonlinear interpolation of topic models for language model adaptation.