ASRU 2011: Waikoloa, HI, USA
- David Nahamoo, Michael Picheny: 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, ASRU 2011, Waikoloa, HI, USA, December 11-15, 2011. IEEE 2011, ISBN 978-1-4673-0365-1
Acoustic Modeling
- Simon Wiesler, Ralf Schlüter, Hermann Ney: A convergence analysis of log-linear training and its application to speech recognition. 1-6
- Muhammad Ali Tahir, Ralf Schlüter, Hermann Ney: Discriminative splitting of Gaussian/log-linear mixture HMMs for speech recognition. 7-11
- Ryuki Tachibana, Takashi Fukuda, Upendra V. Chaudhari, Bhuvana Ramabhadran, Puming Zhan: Frame-level AnyBoost for LVCSR with the MMI Criterion. 12-17
- Shi-Xiong Zhang, Mark J. F. Gales: Extending noise robust structured support vector machines to larger vocabulary tasks. 18-23
- Frank Seide, Gang Li, Xie Chen, Dong Yu: Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription. 24-29
- Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran, Petr Fousek, Petr Novák, Abdel-rahman Mohamed: Making Deep Belief Networks effective for large vocabulary continuous speech recognition. 30-35
- Martin Wöllmer, Björn W. Schuller, Gerhard Rigoll: A novel bottleneck-BLSTM front-end for feature-level context modeling in conversational speech recognition. 36-41
- Karel Veselý, Martin Karafiát, Frantisek Grézl: Convolutive Bottleneck Network features for LVCSR. 42-47
- Wen-Lin Zhang, Wei-Qiang Zhang, Bi-Cheng Li: Speaker adaptation based on speaker-dependent eigenphone estimation. 48-52
- Peder A. Olsen, Jing Huang, Vaibhava Goel, Steven J. Rennie: Sparse Maximum A Posteriori adaptation. 53-58
- Tara N. Sainath, David Nahamoo, Dimitri Kanevsky, Bhuvana Ramabhadran, Parikshit M. Shah: A convex hull approach to sparse representations for exemplar-based speech recognition. 59-64
- George Saon, Jen-Tzung Chien: Some properties of Bayesian sensing hidden Markov models. 65-70
- Dan Gillick, Larry Gillick, Steven Wegmann: Don't multiply lightly: Quantifying problems with the acoustic model assumptions in speech recognition. 71-76
- Rohit Prabhavalkar, Eric Fosler-Lussier, Karen Livescu: A factored conditional random field model for articulatory feature forced transcription. 77-82
- Hiroshi Fujimura, Masanobu Nakamura, Yusuke Shinohara, Takashi Masuko: N-Best rescoring by adaboost phoneme classifiers for isolated word recognition. 83-88
- Hung-An Chang, James R. Glass: Multi-level context-dependent acoustic modeling for automatic speech recognition. 89-94
- Matthias Paulik, Panchi Panchapagesan: Leveraging large amounts of loosely transcribed corporate videos for acoustic model training. 95-100
ASR Robustness
- Jort F. Gemmeke, Hugo Van hamme: An hierarchical exemplar-based sparse model of speech, with an application to ASR. 101-106
- Khe Chai Sim, Minh-Thang Luong: A Trajectory-based Parallel Model Combination with a unified static and dynamic parameter compensation for noisy speech recognition. 107-112
- Yongqiang Wang, Mark J. F. Gales: Improving reverberant VTS for hands-free robust speech recognition. 113-118
- Anton Ragni, Mark J. F. Gales: Derivative kernels for noise robust ASR. 119-124
- Rogier C. van Dalen, Mark J. F. Gales: A variational perspective on noise-robust speech recognition. 125-130
- Vikramjit Mitra, Hosung Nam, Carol Y. Espy-Wilson: Robust speech recognition using articulatory gestures in a Dynamic Bayesian Network framework. 131-136
- Steven J. Rennie, Pierre L. Dognin, Petr Fousek: Matched-condition robust Dynamic Noise Adaptation. 137-140
- Mickael Rouvier, Mohamed Bouallegue, Driss Matrouf, Georges Linarès: Factor analysis based session variability compensation for Automatic Speech Recognition. 141-145
- Michael L. Seltzer, Alex Acero: Factored adaptation for separable compensation of speaker and environmental variability. 146-151
- Martin Karafiát, Lukás Burget, Pavel Matejka, Ondrej Glembek, Jan Cernocký: iVector-based discriminative adaptation for automatic speech recognition. 152-157
- Daniel Povey, Geoffrey Zweig, Alex Acero: Speaker adaptation with an Exponential Transform. 158-163
- Sid-Ahmed Selouani: Evolutionary discriminative speaker adaptation. 164-168
- Arata Itoh, Sunao Hara, Norihide Kitaoka, Kazuya Takeda: Robust seed model training for speaker adaptation using pseudo-speaker features generated by inverse CMLLR transformation. 169-172
- Yasunari Obuchi, Ryu Takeda, Masahito Togami: Bidirectional OM-LSA speech estimator for noise robust speech recognition. 173-178
- Ken'ichi Kumatani, John W. McDonough, Bhiksha Raj: Maximum kurtosis beamforming with a subspace filter for distant speech recognition. 179-184
- Cemil Demir, Ali Taylan Cemgil, Murat Saraclar: Gain estimation approaches in catalog-based single-channel speech-music separation. 185-190
- Hiroko Murakami, Koichi Shinoda, Sadaoki Furui: Designing text corpus using phone-error distribution for acoustic modeling. 191-195
Language Modeling and ASR Systems
- Tomás Mikolov, Anoop Deoras, Daniel Povey, Lukás Burget, Jan Cernocký: Strategies for training large scale neural network language models. 196-201
- Hasim Sak, Murat Saraclar, Tunga Gungor: Discriminative reranking of ASR hypotheses with morpholexical and N-best-list features. 202-207
- Hong-Kwang Jeff Kuo, Ebru Arisoy, Lidia Mangu, George Saon: Minimum Bayes risk discriminative language models for Arabic speech recognition. 208-213
- Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur: Efficient discriminative training of long-span language models. 214-219
- Ariya Rastrow, Mark Dredze, Sanjeev Khudanpur: Adapting n-gram maximum entropy language models with conditional entropy regularization. 220-225
- Puyang Xu, Sanjeev Khudanpur, Asela Gunawardana: Randomized maximum entropy language models. 226-230
- Jia Cui, Stanley F. Chen, Bowen Zhou: Efficient representation and fast look-up of Maximum Entropy language models. 231-236
- Stanley F. Chen, Abhinav Sethy, Bhuvana Ramabhadran: Pruning exponential language models. 237-242
- Timo Mertens, Stephanie Seneff: Subword-based automatic lexicon learning for Speech Recognition. 243-248
- Upendra V. Chaudhari, Xiaodong Cui, Bowen Zhou, Rong Zhang: An investigation of heuristic, manual and statistical pronunciation derivation for Pashto. 249-253
- Timo Mertens, Kit Thambiratnam, Frank Seide: Subword-based multi-span pronunciation adaptation for recognizing accented speech. 254-259
- Horia Cucu, Laurent Besacier, Corneliu Burileanu, Andi Buzo: Investigating the role of machine translated text in ASR domain adaptation: Unsupervised and semi-supervised methods. 260-265
- Hagen Soltau, Lidia Mangu, Fadi Biadsy: From Modern Standard Arabic to Levantine ASR: Leveraging GALE for dialects. 266-271
- Lidia Mangu, Hong-Kwang Kuo, Stephen M. Chu, Brian Kingsbury, George Saon, Hagen Soltau, Fadi Biadsy: The IBM 2011 GALE Arabic speech transcription system. 272-277
- Fethi Bougares, Yannick Estève, Paul Deléglise, Georges Linarès: Bag of n-gram driven decoding for LVCSR system harnessing. 278-282
- Izhak Shafran, Richard Sproat, Mahsa Yarmohammadi, Brian Roark: Efficient determinization of tagged word lattices using categorial and lexicographic semirings. 283-288
TTS, Dialog and MLSP
- William Yang Wang, Kallirroi Georgila: Automatic detection of unnatural word-level segments in unit-selection speech synthesis. 289-294
- Chai Wutiwiwatchai, Ausdang Thangthai, Ananlada Chotimongkol, Chatchawarn Hansakunbuntheung, Nattanun Thatphithakkul: Accent level adjustment in bilingual Thai-English text-to-speech synthesis. 295-299
- Jerome R. Bellegarda: Sentiment analysis of text-to-speech input using latent affective mapping. 300-305
- José Lopes, Maxine Eskénazi, Isabel Trancoso: Towards choosing better primes for spoken dialog systems. 306-311
- Milica Gasic, Filip Jurcícek, Blaise Thomson, Kai Yu, Steve J. Young: On-line policy optimisation of spoken dialogue systems via live interaction with human subjects. 312-317
- Toyomi Meguro, Yasuhiro Minami, Ryuichiro Higashinaka, Kohji Dohsaka: Wizard of Oz evaluation of listening-oriented dialogue control using POMDP. 318-323
- Jingjing Liu, Stephanie Seneff: A dialogue system for accessing drug reviews. 324-329
- Ryuichiro Higashinaka, Noriaki Kawamae, Kugatsu Sadamitsu, Yasuhiro Minami, Toyomi Meguro, Kohji Dohsaka, Hirohito Inagaki: Building a conversational model from two-tweets. 330-335
- Mitsuru Takaoka, Hiromitsu Nishizaki, Yoshihiro Sekiguchi: Utterance verification using garbage words for a hospital appointment system with speech interface. 336-341
- Shuai Huang, Damianos G. Karakos, Glen A. Coppersmith, Kenneth Ward Church, Sabato Marco Siniscalchi: Bootstrapping a spoken language identification system using unsupervised integrated sensing and processing decision trees. 342-347
- David Imseng, Ramya Rasipuram, Mathew Magimai-Doss: Fast and flexible Kullback-Leibler divergence based acoustic modeling for non-native speech recognition. 348-353
- Yanmin Qian, Ji Xu, Daniel Povey, Jia Liu: Strategies for using MLP based features with limited target-language training data. 354-358
- Frantisek Grézl, Martin Karafiát, Milos Janda: Study of probabilistic and Bottle-Neck features in multilingual environment. 359-364
- Liang Lu, Arnab Ghoshal, Steve Renals: Regularized subspace Gaussian mixture models for cross-lingual speech recognition. 365-370
- Christian Plahl, Ralf Schlüter, Hermann Ney: Cross-lingual portability of Chinese and English neural network features for French and German LVCSR. 371-376
- Luis Javier Rodríguez, Mikel Peñagarikano, Amparo Varona, Mireia Díez, Germán Bordel, David Martínez González, Jesús Antonio Villalba López, Antonio Miguel, Alfonso Ortega, Eduardo Lleida, Alberto Abad, Oscar Koller, Isabel Trancoso, Paula Lopez-Otero, Laura Docío Fernández, Carmen García-Mateo, Rahim Saeidi, Mehdi Soufifar, Tomi Kinnunen, Torbjørn Svendsen, Pasi Fränti: Multi-site heterogeneous system fusions for the Albayzin 2010 Language Recognition Evaluation. 377-382
Spoken Document Retrieval and Spoken Language Understanding
- Tsung-wei Tu, Hung-yi Lee, Lin-Shan Lee: Improved spoken term detection using support vector machines with acoustic and context features from pseudo-relevance feedback. 383-388
- Berlin Chen, Pei-Ning Chen, Kuan-Yu Chen: Query modeling for spoken document retrieval. 389-394
- Timothy J. Hazen, Man-Hung Siu, Herbert Gish, Steve Lowe, Arthur Chan: Topic modeling for spoken documents using only phonetic information. 395-400
- Aren Jansen, Benjamin Van Durme: Efficient spoken term discovery using randomized algorithms. 401-406
- Damianos G. Karakos, Mark Dredze, Ken Ward Church, Aren Jansen, Sanjeev Khudanpur: Estimating document frequencies in a speech corpus. 407-412
- Weiqun Xu, Changchun Bao, Yali Li, Jielin Pan, Yonghong Yan: Robust understanding of spoken Chinese through character-based tagging and prior knowledge exploitation. 413-418
- Dilek Hakkani-Tür, Gökhan Tür, Larry P. Heck, Asli Celikyilmaz, Ashley Fidler, Dustin Hillard, Rukmini Iyer, Sarangarajan Parthasarathy: Employing web search query click logs for multi-domain spoken language understanding. 419-424
- Asli Celikyilmaz, Dilek Hakkani-Tür, Gökhan Tür, Ashley Fidler, Dustin Hillard: Exploiting distance based similarity in topic models for user intent detection. 425-430
- Liva Ralaivola, Benoît Favre, Pierre Gotab, Frédéric Béchet, Géraldine Damnati: Applying Multiclass Bandit algorithms to call-type classification. 431-436
- Babak Loni, Seyedeh Halleh Khoshnevis, Pascal Wiggers: Latent semantic analysis for question classification with neural networks. 437-442
- Bin Zhang, Alex Marin, Brian Hutchinson, Mari Ostendorf: Analyzing conversations using rich phrase patterns. 443-448
- Anthony P. Stark, Izhak Shafran, Jeffrey A. Kaye: Supervised and unsupervised feature selection for inferring social nature of telephone conversations from their content. 449-454
- Yangyang Shi, Pascal Wiggers, Catholijn M. Jonker: Socio-situational setting classification based on language use. 455-460
- Klaus Zechner, Xiaoming Xi, Lei Chen: Evaluating prosodic features for automated scoring of non-native read speech. 461-466
- Di Lu, Takuya Nishimoto, Nobuaki Minematsu: Decision of response timing for incremental speech recognition with reinforcement learning. 467-472
New Applications in Speech Processing
- Lei Chen: Applying feature bagging for more accurate and robust automated speaking assessment. 473-477
- Tobias Bocklet, Elmar Nöth, Georg Stemmer, Hana Ruzickova, Jan Rusz: Detection of persons with Parkinson's disease by acoustic, vocal, and prosodic analysis. 478-483
- Emily Tucker Prud'hommeaux, Brian Roark: Alignment of spoken narratives for automated neuropsychological assessment. 484-489
- Jiahong Yuan, Mark Liberman: Automatic detection of "g-dropping" in American English using forced alignment. 490-493
- Shunta Ishii, Tomoki Toda, Hiroshi Saruwatari, Sakriani Sakti, Satoshi Nakamura: Blind noise suppression for Non-Audible Murmur recognition with stereo signal processing. 494-499
- Chao Zhang, Yi Liu, Chin-Hui Lee: Detection-based accented speech recognition using articulatory features. 500-505
- Alfonso M. Canterla, Magne Hallstein Johnsen: Minimum detection error training of subword detectors. 506-511
- Mohamed Bouallegue, Driss Matrouf, Mickael Rouvier, Georges Linarès: Subspace Gaussian Mixture Models for vectorial HMM-states representation. 512-516
- Guangpu Huang, Meng Joo Er: A novel neural-based pronunciation modeling method for robust speech recognition. 517-522
- Zixing Zhang, Felix Weninger, Martin Wöllmer, Björn W. Schuller: Unsupervised learning in cross-corpus acoustic emotion recognition. 523-528
- Sankaranarayanan Ananthakrishnan, Aravind Namandi Vembu, Rohit Prasad: Model-based parametric features for emotion recognition from speech. 529-534
- Jason D. Williams, I. Dan Melamed, Tirso Alonso, Barbara Hollister, Jay G. Wilpon: Crowd-sourcing for difficult transcription of speech. 535-540
- Kengo Ohta, Masatoshi Tsuchiya, Seiichi Nakagawa: Detection of precisely transcribed parts from inexact transcribed corpus. 541-546
- Md. Jahangir Alam, Tomi Kinnunen, Patrick Kenny, Pierre Ouellet, Douglas D. O'Shaughnessy: Multi-taper MFCC features for speaker verification using I-vectors. 547-552
- Ekaterina Gonina, Gerald Friedland, Henry Cook, Kurt Keutzer: Fast speaker diarization using a high-level scripting language. 553-558
- Xinhui Zhou, Daniel Garcia-Romero, Ramani Duraiswami, Carol Y. Espy-Wilson, Shihab A. Shamma: Linear versus mel frequency cepstral coefficients for speaker recognition. 559-564