


IEEE Transactions on Audio, Speech & Language Processing, Volume 14
Volume 14, Number 1, January 2006
- Lie Lu, Dan Liu, HongJiang Zhang: Automatic mood detection and tracking of music audio signals. 5-18
- Ning Ma, Martin Bouchard, Rafik A. Goubran: Speech enhancement using a masking threshold constrained Kalman filter and its heuristic implementations. 19-32
- James D. Gordy, Rafik A. Goubran: On the perceptual performance limitations of echo cancellers in wideband telephony. 33-42
- Marcus Holmberg, David Gelbart, Werner Hemmert: Automatic speech recognition with an adaptation model motivated by auditory processing. 43-49
- Thomas Blumensath, Mike E. Davies: Sparse and shift-Invariant representations of music. 50-57
- Sue Harding, Jon P. Barker, Guy J. Brown: Mask estimation for missing data speech recognition based on statistics of binaural interaction. 58-67
- Slim Essid, Gaël Richard, Bertrand David: Instrument recognition in polyphonic music based on automatic taxonomies. 68-80
- Fabian Mörchen, Alfred Ultsch, Michael Thies, Ingo Lohken: Modeling timbre distance with temporal statistics from polyphonic music. 81-90
- Emmanuel Vincent: Musical source separation using time-frequency source priors. 91-98
- Mads Græsbøll Christensen, Søren Holdt Jensen: On perceptual distortion minimization and nonlinear least-squares frequency estimation. 99-109
- Alberto González, Maria de Diego, Miguel Ferrer, Gema Pinero: Multichannel active noise equalization of interior noise. 110-122
- Yoichi Hinamoto, Hideaki Sakai: Analysis of the filtered-X LMS algorithm and a related new algorithm for active control of multitonal noise. 123-130
- Norman H. Adams, Mark A. Bartsch, Gregory H. Wakefield: Note segmentation and quantization for music information retrieval. 131-141
- Norman D. Cook, Takashi X. Fujisawa, Kazuaki Takami: Evaluation of the affective valence of speech using pitch substructure. 142-151
- Anand D. Subramaniam, William R. Gardner, Bhaskar D. Rao: Iterative joint source-channel decoding of speech spectrum parameters over an additive white Gaussian noise channel. 152-162
- Sriram Srinivasan, Jonas Samuelsson, W. Bastiaan Kleijn: Codebook driven short-term predictor parameter estimation for speech enhancement. 163-176
- Yoshifumi Nagata, Toyota Fujioka, Masato Abe: Speech enhancement based on auto gain control. 177-190
- Laurent Benaroya, Frédéric Bimbot, Rémi Gribonval: Audio source separation with a single sensor. 191-199
- Kostas Kokkinakis, Asoke K. Nandi: Multichannel blind deconvolution for source separation in convolutive mixtures of speech. 200-212
- Narendra K. Gupta, Gökhan Tür, Dilek Hakkani-Tür, Srinivas Bangalore, Giuseppe Riccardi, Mazin Gilbert: The AT&T spoken language understanding system. 213-222
- Ben Milner, Alastair Bruce James: Robust speech recognition over mobile and IP networks in burst-like packet loss. 223-231
- Ken Chen, Mark Hasegawa-Johnson, Aaron Cohen, Sarah Borys, Sung-Suk Kim, Jennifer Cole, Jeung-Yoon Choi: Prosody dependent speech recognition on radio news corpus of American English. 232-245
- Néstor Becerra Yoma, Carlos Molina, Jorge F. Silva, Carlos Busso: Modeling, estimating, and compensating low-bit rate coding distortion in speech recognition. 246-255
- Li Deng, Dong Yu, Alex Acero: A bidirectional target-filtering model of speech coarticulation and reduction: two-stage implementation for phonetic recognition. 256-265
- Chung-Hsien Wu, Yu-Hsien Chiu, Chi-Jiun Shia, Chun-Yu Lin: Automatic segmentation and identification of mixed-language speech using delta-BIC and LSA-based GMMs. 266-276
- Tomi Kinnunen, Evgeny Karpov, Pasi Fränti: Real-time speaker identification and verification. 277-288
- Yang Shao, DeLiang Wang: Model-based sequential organization in cochannel speech. 289-298
- Christof Faller: Parametric multichannel audio coding: synthesis of coherence cues. 299-310
- Renat Vafin, W. Bastiaan Kleijn: Rate-distortion optimized quantization in multistage audio coding. 311-320
- Antti J. Eronen, Vesa T. Peltonen, Juha T. Tuomi, Anssi Klapuri, Seppo Fagerlund, Timo Sorsa, Gaëtan Lorho, Jyri Huopaniemi: Audio-based context recognition. 321-329
- Wei-Ho Tsai, Hsin-Min Wang: Automatic singer recognition of popular music recordings via estimation and modeling of solo vocal signals. 330-341
- Anssi Klapuri, Antti J. Eronen, Jaakko Astola: Analysis of the meter of acoustic musical signals. 342-355
- Vaibhava Goel, Shankar Kumar, William Byrne: Corrections to "Segmental minimum Bayes-risk decoding for automatic speech recognition". 356-357
Volume 14, Number 2, March 2006
- Satoshi Nakamura, Konstantin Markov, Hiromi Nakaiwa, Gen-ichiro Kikui, Hisashi Kawai, Takatoshi Jitsuhiro, Jinsong Zhang, Hirofumi Yamamoto, Eiichiro Sumita, Seiichi Yamamoto: The ATR multilingual speech-to-speech translation system. 365-376
- Liang Gu, Yuqing Gao, Fu-Hua Liu, Michael Picheny: Concept-based speech-to-speech translation using maximum entropy models for statistical natural concept generation. 377-392
- Yasuhiro Akiba, Kenji Imamura, Eiichiro Sumita, Hiromi Nakaiwa, Shun'ichi Yamamoto, Hiroshi G. Okuno: Using multiple edit distances to automatically grade outputs from Machine translation systems. 393-402
- Tanja Schultz, Alan W. Black, Stephan Vogel, Monika Woszczyna: Flexible speech translation systems. 403-411
- Alan Davis, Sven Nordholm, Roberto Togneri: Statistical voice activity detection using low-variance spectrum estimation and an adaptive threshold. 412-424
- Li Deng, Alex Acero, Issam Bazzi: Tracking vocal tract resonances using a quantized nonlinear function embedded in a temporal constraint. 425-434
- Kamran Mustafa, Ian C. Bruce: Robust formant tracking for continuous speech with speaker variability. 435-444
- Huiqun Deng, Rabab K. Ward, Michael P. Beddoes, Murray Hodgson: A new method for obtaining accurate estimates of vocal-tract filters and glottal waves from vowel sounds. 445-455
- Mike Brookes, Patrick A. Naylor, Jón Guðnason: A quantitative assessment of group delay methods for identifying glottal closures in voiced speech. 456-466
- Ran D. Zilca, Brian Kingsbury, Jirí Navrátil, Ganesh N. Ramaswamy: Pseudo pitch synchronous analysis of speech with applications to speaker recognition. 467-478
- Saeed Gazor, Reza Rashidi Far: Adaptive maximum windowed likelihood multicomponent AM-FM signal decomposition. 479-491
- Qiang Fu, Peter Murphy: Robust glottal source estimation based on joint source-filter model optimization. 492-501
- Etan Fisher, Joseph Tabrikian, Shlomo Dubnov: Generalized likelihood ratio test for voiced-unvoiced decision in noisy speech using the harmonic model. 502-510
- Doroteo T. Toledano, Jesús Gómez Villardebó, Luis A. Hernández Gómez: Initialization, training, and context-dependency in HMM-based formant tracking. 511-523
- Anand D. Subramaniam, William R. Gardner, Bhaskar D. Rao: Low-complexity source coding using Gaussian mixture models, lattice vector quantization, and recursive coding with application to speech spectrum quantization. 524-532
- Thomas F. Quatieri, Kevin Brady, D. Messing, Joseph P. Campbell, William M. Campbell, Michael S. Brandstein, Clifford J. Weinstein, John D. Tardelli, Paul D. Gatewood: Exploiting nonacoustic sensors for speech encoding. 533-544
- Hui Dong, Jerry D. Gibson: Structures for SNR scalable speech coding. 545-557
- Udaya Bhaskar, Kumar Swaminathan: Low bit-rate voice compression based on frequency domain interpolative techniques. 558-576
- Harald Gustafsson, Ulf A. Lindgren, Ingvar Claesson: Low-complexity feature-mapped speech bandwidth extension. 577-588
- Olivier Pietquin, Thierry Dutoit: A probabilistic framework for dialog simulation and optimal strategy learning. 589-599
- Bojana Gajic, Kuldip K. Paliwal: Robust speech recognition in noisy environments based on subband spectral centroid histograms. 600-608
- Hossein Najaf-Zadeh, Peter Kabal: Perceptual coding of narrow-band audio signals at low rates. 609-622
- Ashish Aggarwal, Shankar L. Regunathan, Kenneth Rose: A trellis-based optimal parameter value selection for audio coding. 623-633
- Pongtep Angkititrakul, John H. L. Hansen: Advances in phone-based modeling for automatic accent classification. 634-646
- Chung-Hsien Wu, Chia-Hsin Hsieh: Multiple change-point audio segmentation and classification using an MDL-based Gaussian model. 647-657
- Ngwa A. Shusina, Boaz Rafaely: Unbiased adaptive feedback cancellation in hearing aids by closed-loop identification. 658-665
- Hiroshi Saruwatari, Toshiya Kawamura, Tsuyoki Nishikawa, Akinobu Lee, Kiyohiro Shikano: Blind source separation based on a fast-convergence algorithm combining ICA and beamforming. 666-678
- Ali Taylan Cemgil, Hilbert J. Kappen, David Barber: A generative model for music transcription. 679-694
- Mitsuko Aramaki, Richard Kronland-Martinet: Analysis-synthesis of impact sounds by real-time dynamic filtering. 695-705
- Kelvin Chee-Mun Lee, Woon-Seng Gan: Bandwidth-efficient recursive pth-order equalization for correcting baseband distortion in parametric loudspeakers. 706-710
- L. E. Rees, Stephen J. Elliott: Adaptive algorithms for active sound-profiling. 711-719
- Muhammad Tahir Akhtar, Masahide Abe, Masayuki Kawamata: A new variable step size LMS algorithm-based method for improved online secondary path modeling in active noise control systems. 720-726
- Thomas Hain, Philip C. Woodland, Gunnar Evermann, Mark J. F. Gales, Xunying Liu, Gareth L. Moore, Daniel Povey, Lan Wang: Corrections to "Automatic Transcription of Conversational Telephone Speech". 727-727
Volume 14, Number 3, May 2006
- S. Ramamohan, Samarendra Dandapat: Sinusoidal model-based analysis and classification of stressed speech. 737-746
- Joon-Hyuk Chang, Nam Soo Kim: A new structural approach in system identification with generalized analysis-by-synthesis for robust speech coding. 747-751
- Christoffer Asgaard Rødbro, Jesper Jensen, Richard Heusdens: Rate-distortion optimal time-segmentation and redundancy selection for VoIP. 752-763
- Volodya Grancharov, Jonas Samuelsson, W. Bastiaan Kleijn: On causal algorithms for speech enhancement. 764-773
- Mingyang Wu, DeLiang Wang: A two-stage algorithm for one-microphone reverberant speech enhancement. 774-784
- Andy W. H. Khong, Patrick A. Naylor: Stereophonic acoustic echo cancellation employing selective-tap adaptive algorithms. 785-796
- Jen-Tzung Chien, Chih-Hsien Huang: Aggregate a posteriori linear regression adaptation. 797-807
- Jeih-Weih Hung, Lin-Shan Lee: Optimization of temporal filters for constructing robust features in speech recognition. 808-832
- Ji Ming: Noise compensation for speech recognition with arbitrary additive noise. 833-844
- Florian Hilger, Hermann Ney: Quantile based histogram equalization for noise robust large vocabulary speech recognition. 845-854
- Shinji Watanabe, Atsushi Sako, Atsushi Nakamura: Automatic determination of acoustic model topology using variational Bayesian estimation and clustering for large vocabulary continuous speech recognition. 855-872
- Hong-Kwang Jeff Kuo, Yuqing Gao: Maximum entropy direct models for speech recognition. 873-881
- Khe Chai Sim, Mark J. F. Gales: Minimum phone error training of precision matrix models. 882-889
- Jorge F. Silva, Shrikanth S. Narayanan: Average divergence distance as a statistical discrimination measure for hidden Markov models. 890-906
- Rongqing Huang, John H. L. Hansen: Advances in unsupervised audio classification and segmentation for the broadcast news and NGSW corpora. 907-919
- Nima Mesgarani, Malcolm Slaney, Shihab A. Shamma: Discrimination of speech from nonspeech based on multiscale spectro-temporal Modulations. 920-930
- R. Sant'Ana, Rosangela Coelho, Abraham Alcaim: Text-independent speaker recognition based on the Hurst parameter and the multidimensional fractional Brownian motion model. 931-940
- Enrique Vidal, Francisco Casacuberta, Luis Rodríguez, Jorge Civera, Carlos D. Martínez-Hinarejos: Computer-assisted translation using speech recognition. 941-951
- Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller: Nonparallel training for voice conversion based on a parameter adaptation approach. 952-963
- Jack Mullen, David M. Howard, Damian T. Murphy: Waveguide physical modeling of vocal tract acoustics: flexible formant bandwidth control from increased model dimensionality. 964-971
- K. Sreenivasa Rao, B. Yegnanarayana: Prosody modification using instants of significant excitation. 972-980
- Ki-Seung Lee: MLP-based phone boundary refining for a TTS database. 981-989
- Jerome R. Bellegarda: A global, boundary-centric framework for unit selection text-to-speech synthesis. 990-997
- Cheng-Han Yang, Hsueh-Ming Hang: Cascaded trellis-based rate-distortion control algorithm for MPEG-4 advanced audio coding. 998-1007
- Ben Supper, Tim Brookes, Francis Rumsey: An auditory onset detection algorithm for improved automatic source localization. 1008-1017
- Woon-Seng Gan, Jun Yang, Khim Sia Tan, Meng Hwa Er: A digital beamsteerer for difference frequency in a parametric array. 1018-1025
- Rui Cai, Lie Lu, Alan Hanjalic, HongJiang Zhang, Lian-Hong Cai: A flexible framework for key audio effects detection and auditory context inference. 1026-1039
- Dimitrios K. Fragoulis, Constantin Papaodysseus, Mihalis Exarhos, George Roussopoulos, Thanasis Panagopoulos, Dimitrios Kamarotos: Automated classification of piano-guitar notes. 1040-1050
- Harald Viste, Gianpaolo Evangelista: A method for separation of overlapping partials based on similarity of temporal envelopes in multichannel mixtures. 1051-1061
- Serkan Kiranyaz, Ahmad Farooq Qureshi, Moncef Gabbouj: A generic audio classification and segmentation approach for multimedia indexing and retrieval. 1062-1081
- Timothy J. Hazen: Visual model structures and synchrony constraints for audio-visual speech recognition. 1082-1089
Volume 14, Number 4, July 2006
- John F. Pitrelli, Raimo Bakis, Ellen Eide, Raul Fernandez, Wael Hamza, Michael A. Picheny: The IBM expressive text-to-speech synthesis system for American English. 1099-1108
- Chung-Hsien Wu, Chi-Chun Hsia, Te-Hsien Liu, Jhing-Fa Wang: Voice conversion using duration-embedded bi-HMMs for expressive speech synthesis. 1109-1116
- Marc Schröder: Expressing degree of activation in synthetic speech. 1128-1136
- Mariët Theune, K. Meijs, Dirk Heylen, Roeland Ordelman: Generating expressive speech for storytelling applications. 1137-1144
- Jianhua Tao, Yongguo Kang, Aijun Li: Prosody conversion from neutral speech to emotional speech. 1145-1154
- Wentao Gu, Keikichi Hirose, Hiroya Fujisaki: Modeling the effects of emphasis and question on fundamental frequency contours of Cantonese utterances. 1155-1170
- N. Campbell: Conversational speech synthesis and the need for some laughter. 1171-1178
- Taishih Chi, Shihab A. Shamma: Spectrum restoration from multiscale auditory phase singularities by generalized projections. 1179-1192
- Akira Watanabe, Tadashi Sakata: Reliable methods for estimating relative vocal tract lengths from formant trajectories of common words. 1193-1204
- W. C. Chu: Embedded quantization of line spectral frequencies using a multistage tree-structured vector quantizer. 1205-1217
- Jingdong Chen, Jacob Benesty, Yiteng Arden Huang, Simon Doclo: New insights into the noise reduction Wiener filter. 1218-1234
- Yunxin Zhao, Rong Hu, Xiaolong Li: Speedup convergence and reduce noise for enhanced speech separation and recognition. 1235-1244
- Jen-Tzung Chien, Bo-Cheng Chen: A new independent component analysis for speech recognition and separation. 1245-1254
- Satya Dharanipragada, Karthik Visweswariah: Gaussian mixture models with covariances or precisions in shared multiple subspaces. 1255-1266
- Brian Kan-Wing Mak, Roger Wend-Huu Hsiao, Simon Ka-Lung Ho, James T. Kwok: Embedded kernel eigenvoice speaker adaptation and its implication to reference speaker weighting. 1267-1280
- Diamantino Caseiro, Isabel Trancoso: A specialized on-the-fly algorithm for lexicon and language model composition. 1281-1291
- Toshihiko Abe, Masaaki Honda: Sinusoidal model based on instantaneous frequency attractors. 1292-1300
- Hui Ye, Steve J. Young: Quality-enhanced voice morphing using maximum likelihood transformations. 1301-1312
- Ashish Aggarwal, Shankar L. Regunathan, Kenneth Rose: Efficient bit-rate scalability for weighted squared error optimization in audio coding. 1313-1327
- Olivier Derrien, Pierre Duhamel, Maurice Charbit, Gaël Richard: A new quantization optimization algorithm for the MPEG advanced audio coder using a statistical subband model of the quantization noise. 1328-1339
- Mads Græsbøll Christensen, Steven van de Par: Efficient parametric coding of transients. 1340-1351
- Rongshan Yu, Susanto Rahardja, Xiao Lin, Chi Chung Ko: A fine granular scalable to lossless audio coder. 1352-1363
- T. Umayahara, Haruhide Hokari, Shoji Shimada: Stereo width control using interpolation and extrapolation of time-frequency representation. 1364-1377
- Fotios Talantzis, Darren B. Ward, Patrick A. Naylor: Performance analysis of dynamic acoustic source separation in reverberant rooms. 1378-1390
- Paulo A. A. Esquef, Luiz W. P. Biscainho: An efficient model-based multirate method for reconstruction of audio signals across long gaps. 1391-1400
- Slim Essid, Gaël Richard, Bertrand David: Musical instrument recognition by pairwise classification strategies. 1401-1412
- Ixone Arroabarren, Xavier Rodet, Alfonso Carlosena: On the measurement of the instantaneous frequency and amplitude of partials in vocal vibrato. 1413-1421
- Ixone Arroabarren, Alfonso Carlosena: Inverse filtering in singing voice: a critical analysis. 1422-1431
- Saman S. Abeysekera, Kabi Prakash Padhi: An investigation of window effects on the frequency estimation using the phase vocoder. 1432-1439
- Axel Röbel: Adaptive additive modeling with continuous parameter trajectories. 1440-1453
- Crispin H. V. Cooper, Damian T. Murphy, David M. Howard, Alexander Tyrrell: Singing synthesis with an evolved physical model. 1454-1461
- Emmanuel Vincent, Rémi Gribonval, Cédric Févotte: Performance measurement in blind audio source separation. 1462-1469
- Panayiotis G. Georgiou, Chris Kyriakakis: Robust maximum likelihood source localization: the case for sub-Gaussian versus Gaussian. 1470-1480
Volume 14, Number 5, September 2006
- Li Deng, Dong Yu, Alex Acero: Structured speech modeling. 1492-1504
- Claude Barras, Xuan Zhu, Sylvain Meignier, Jean-Luc Gauvain: Multistage speaker diarization of broadcast news. 1505-1512
- Mark J. F. Gales, Do Yeong Kim, Philip C. Woodland, Ho Yin Chan, David Mrva, Rohit Sinha, S. E. Tranter: Progress in the CU-HTK broadcast news transcription system. 1513-1525
- Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Dustin Hillard, Mari Ostendorf, Mary P. Harper: Enriching speech recognition with automatic detection of sentence boundaries and disfluencies. 1526-1540
- Spyridon Matsoukas, Jean-Luc Gauvain, Gilles Adda, Thomas Colthurst, Chia-Lin Kao, Owen Kimball, Lori Lamel, Fabrice Lefèvre, Jeff Z. Ma, John Makhoul, Long Nguyen, Rohit Prasad, Richard M. Schwartz, Holger Schwenk, Bing Xiang: Advances in transcription of broadcast news and conversational telephone speech within the combined EARS BBN/LIMSI system. 1541-1556
- S. E. Tranter, Douglas A. Reynolds: An overview of automatic speaker diarization systems. 1557-1565
- Matthew Lease, Mark Johnson, Eugene Charniak: Recognizing disfluencies in conversational speech. 1566-1573
- Jui-Feng Yeh, Chung-Hsien Wu: Edit disfluency detection and correction using a cleanup language model and an alignment model. 1574-1583
- Hui Jiang, Xinwei Li, Chaojun Liu: Large margin hidden Markov models for speech recognition. 1584-1595
- Stanley F. Chen, Brian Kingsbury, Lidia Mangu, Daniel Povey, George Saon, Hagen Soltau, Geoffrey Zweig: Advances in speech transcription at IBM under the DARPA EARS program. 1596-1608
- Christoffer Asgaard Rødbro, Manohar N. Murthi, Søren Vang Andersen, Søren Holdt Jensen: Hidden Markov model-based packet loss concealment for voice over IP. 1609-1623
- Farshad Lahouti, Ahmad R. Fazel, A. H. Safavi-Naeini, Amir K. Khandani: Single and double frame coding of speech LPC parameters using a lattice-based quantization scheme. 1624-1632
- Herbert Buchner, Jacob Benesty, Tomas Gänsler, Walter Kellermann: Robust extended multidelay filter and double-talk detector for acoustic echo cancellation. 1633-1644
- Marcin Kuropatwinski, W. Bastiaan Kleijn: Estimation of the short-term predictor parameters of speech under noisy conditions. 1645-1655
- Rile Hu, Chengqing Zong, Bo Xu: An approach to automatic acquisition of translation templates based on phrase structure extraction and alignment. 1656-1663
- Alon Lavie, Fabio Pianesi, Lori S. Levin: The NESPOLE! System for multilingual speech communication over the Internet. 1664-1673
- Gen-ichiro Kikui, Seiichi Yamamoto, Toshiyuki Takezawa, Eiichiro Sumita: Comparative study on corpora for speech translation. 1674-1682
- Xiao Li, Jonathan Malkin, Jeff A. Bilmes: A high-speed, low-resource ASR back-end based on custom arithmetic. 1683-1693
- Kai Yu, Mark J. F. Gales: Discriminative cluster adaptive training. 1694-1703
- Chak-Fai Li, Man-Hung Siu, Jeff Siu-Kei Au-Yeung: Recursive likelihood evaluation and fast search algorithm for polynomial segment model with application to speech recognition. 1704-1718
- Jen-Tzung Chien: Association pattern language modeling. 1719-1728
- Andreas Stolcke, Barry Y. Chen, Horacio Franco, Venkata Ramana Rao Gadde, Martin Graciarena, Mei-Yuh Hwang, Katrin Kirchhoff, Arindam Mandal, Nelson Morgan, Xin Lei, Tim Ng, Mari Ostendorf, M. Kemal Sönmez, Anand Venkataraman, Dimitra Vergyri, Wen Wang, Jing Zheng, Qifeng Zhu: Recent innovations in speech-to-text transcription at SRI-ICSI-UW. 1729-1744
- Nicolae Duta, Richard M. Schwartz, John Makhoul: Analysis of the errors produced by the 2004 BBN speech recognition system in the DARPA EARS evaluations. 1745-1753
- Siddharth Mathur, Brad H. Story, Jeffrey J. Rodríguez: Vocal-tract modeling: fractional elongation of segment lengths in a waveguide model with half-sample delays. 1754-1762
- Jithendra Vepa, Simon King: Subjective evaluation of join cost and smoothing methods for unit selection speech synthesis. 1763-1771
- Cléo Baras, Nicolas Moreau, Przemyslaw Dymarski: Controlling the inaudibility and maximizing the robustness in an audio annotation watermarking system. 1772-1782
- Masataka Goto: A chorus section detection method for musical audio signals and its application to a music listening station. 1783-1794
- Aggelos Pikrakis, Sergios Theodoridis, Dimitris Kamarotos: Classification of musical patterns using variable duration hidden Markov models. 1795-1807
- Laurent Daudet: Sparse and structured decompositions of signals with the molecular matching pursuit. 1808-1816
- Vincent Verfaille, Udo Zölzer, Daniel Arfib: Adaptive digital audio effects (a-DAFx): a new class of sound transformations. 1817-1831
- Fabien Gouyon, Anssi Klapuri, Simon Dixon, M. Alonso, George Tzanetakis, C. Uhle, Pedro Cano: An experimental comparison of audio tempo induction algorithms. 1832-1844
- Mark R. Every, John E. Szymanski: Separation of synchronous pitched notes by spectral filtering of harmonics. 1845-1856
- Sen M. Kuo, Ajay B. Puvvala: Effects of frequency separation in periodic active noise control systems. 1857-1866
- Guangji Shi, Maryam Modir Shanechi, Parham Aarabi: On the importance of phase in human speech recognition. 1867-1874
- Debi Prasad Das, Swagat Ranjan Mohapatra, Aurobinda Routray, Tapan Kumar Basu: Filtered-s LMS algorithm for multichannel active control of nonlinear noise processes. 1875-1880
Volume 14, Number 6, November 2006
- Antony W. Rix, John G. Beerends, Doh-Suk Kim, Peter Kroon, Oded Ghitza: Objective Assessment of Speech and Audio Quality - Technology and Applications. 1890-1901
- Rainer Huber, Birger Kollmeier: PEMO-Q - A New Method for Objective Audio Quality Assessment Using a Model of Auditory Perception. 1902-1911
- Abhijit Karmakar, Arun Kumar, R. K. Patney: A Multiresolution Model of Auditory Excitation Pattern and Its Application to Objective Evaluation of Perceived Speech Quality. 1912-1923
- Ludovic Malfait, Jens Berger, Martin Kastner: P.563 - The ITU-T Standard for Single-Ended Speech Quality Assessment. 1924-1934
- Tiago H. Falk, Wai-Yip Chan: Single-Ended Speech Quality Measurement Using Machine Learning Methods. 1935-1947
- Volodya Grancharov, David Yuheng Zhao, Jonas Lindblom, W. Bastiaan Kleijn: Low-Complexity, Nonintrusive Speech Quality Assessment. 1948-1956
- Alexander Raake: Short- and Long-Term Packet Loss Behavior: Towards Speech Quality Prediction for Arbitrary Loss Distributions. 1957-1968
- Sebastian Möller, Alexander Raake, Nobuhiko Kitawaki, Akira Takahashi, Marcel Wältermann: Impairment Factor Framework for Wide-Band Speech Codecs. 1969-1976
- S. R. Broom: VoIP Quality Assessment: Taking Account of the Edge-Device. 1977-1983
- Akira Takahashi, Atsuko Kurashima, Hideaki Yoshino: Objective Assessment Methodology for Estimating Conversational Quality in VoIP. 1984-1993
- Sunish George, Slawomir K. Zielinski, Francis Rumsey: Feature Extraction for the Prediction of Multichannel Spatial Audio Fidelity. 1994-2005
- Takeshi Yamada, Masakazu Kumakura, Nobuhiko Kitawaki: Performance Estimation of Speech Recognition System Under Noise Conditions Using Objective Quality Measures and Artificial Voice. 2006-2013
- Peng Li, Yong Guan, Bo Xu, Wenju Liu: Monaural Speech Separation Based on Computational Auditory Scene Analysis and Objective Quality Assessment of Speech. 2014-2023
- Georgios Evangelopoulos, Petros Maragos: Multiband Modulation Energy Tracking for Noisy Speech Detection. 2024-2038
- Mohamed A. Deriche, Daryl Ning: A Novel Audio Coding Scheme Using Warped Linear Prediction Model and the Discrete Wavelet Transform. 2039-2048
- John H. L. Hansen, V. Radhakrishnan, Kathryn Hoberg Arehart: Speech Enhancement Based on Generalized Minimum Mean Square Error Estimators and Masking Properties of the Auditory System. 2049-2063
- Richard C. Hendriks, Richard Heusdens, Jesper Jensen: Adaptive Time Segmentation for Improved Speech Enhancement. 2064-2074
- Tiemin Mei, Jiangtao Xi, Fuliang Yin, Alfred Mertins, Joe F. Chicharo: Blind Source Separation Based on Time-Domain Optimization of a Frequency-Domain Independence Criterion. 2075-2085
- Yoshifumi Nagata, K. Mitsubori, T. Kagi, Toyota Fujioka, Masato Abe: Fast Implementation of KLT-Based Speech Enhancement Using Vector Quantization. 2086-2097
- Cyril Plapous, Claude Marro, Pascal Scalart: Improved Signal-to-Noise Ratio Estimation for Speech Enhancement. 2098-2108
- Michael L. Seltzer, Richard M. Stern: Subband Likelihood-Maximizing Beamforming for Speech Recognition in Reverberant Environments. 2109-2121
- Man-Hung Siu, Arthur Chan: A Robust Viterbi Algorithm Against Impulsive Noise With Application to Speech Recognition. 2122-2133
- Ye Tian, Jian-Lai Zhou, Hui Lin, Hui Jiang: Tree-Based Covariance Modeling of Hidden Markov Models. 2134-2146
- Jian Wu, Qiang Huo: An Environment-Compensated Minimum Classification Error Training Approach Based on Stochastic Vector Mapping. 2147-2155
- Kevin W. Wilson, Trevor Darrell: Learning a Precedence Effect-Like Weighting Function for the Generalized Cross-Correlation Framework. 2156-2164
- Hiroshi Sawada, Shoko Araki, Ryo Mukai, Shoji Makino: Blind Extraction of Dominant Target Sources Using ICA and Time-Frequency Masking. 2165-2173
- Cédric Févotte, Simon J. Godsill: A Bayesian Approach for Blind Separation of Sparse Sources. 2174-2188
- Yegui Xiao, Liying Ma, Khashayar Khorasani, Akira Ikuta: A New Robust Narrowband Active Noise Control System in the Presence of Frequency Mismatch. 2189-2200
- Yoshikazu Yokotani, Ralf Geiger, G. D. T. Schuller, Soontorn Oraintara, K. R. Rao: Lossless Audio Coding Using the IntMDCT and Rounding Error Shaping. 2201-2211
- Toshio Irino, Roy D. Patterson, Hideki Kawahara: Speech Segregation Using an Auditory Vocoder With Event-Synchronous Enhancements. 2212-2221
- Toshio Irino, Roy D. Patterson: A Dynamic Compressive Gammachirp Auditory Filterbank. 2222-2232
- Wei-Chen Chang, Alvin Wen-Yu Su: A Multichannel Recurrent Network Analysis/Synthesis Model for Coupled-String Instruments. 2233-2241
- Juan Pablo Bello, Laurent Daudet, Mark B. Sandler: Automatic Piano Transcription Using Frequency and Time-Domain Information. 2242-2251
- Panu Somervuo, Aki Härmä, Seppo Fagerlund: Parametric Representations of Bird Sounds for Automatic Species Recognition. 2252-2263
- Ying Li, Chitra Dorai: Instructional Video Content Analysis Using Audio Information. 2264-2274















