Tuomas Virtanen
Person information
- affiliation: Tampere University of Technology, Finland
2020 – today
- 2024
- [j46]Laura Hekanaho, Maija Hirvonen, Tuomas Virtanen:
Language-based machine perception: linguistic perspectives on the compilation of captioning datasets. Digit. Scholarsh. Humanit. 39(3): 864-883 (2024) - [j45]Szymon Drgas, Lars Bramsløw, Archontis Politis, Gaurav Naithani, Tuomas Virtanen:
Dynamic Processing Neural Network Architecture for Hearing Loss Compensation. IEEE ACM Trans. Audio Speech Lang. Process. 32: 203-214 (2024) - [j44]Michael Neri, Archontis Politis, Daniel Aleksander Krause, Marco Carli, Tuomas Virtanen:
Speaker Distance Estimation in Enclosures From Single-Channel Audio. IEEE ACM Trans. Audio Speech Lang. Process. 32: 2242-2254 (2024) - [c177]Mikko Heikkinen, Archontis Politis, Tuomas Virtanen:
Neural Ambisonics Encoding For Compact Irregular Microphone Arrays. ICASSP 2024: 701-705 - [c176]Yuzhu Wang, Archontis Politis, Tuomas Virtanen:
Attention-Driven Multichannel Speech Enhancement in Moving Sound Source Scenarios. ICASSP 2024: 11221-11225 - [i80]Mikko Heikkinen, Archontis Politis, Tuomas Virtanen:
Neural Ambisonics encoding for compact irregular microphone arrays. CoRR abs/2401.05916 (2024) - [i79]John Martinsson, Olof Mogren, Maria Sandsten, Tuomas Virtanen:
From Weak to Strong Sound Event Labels using Adaptive Change-Point Detection and Active Learning. CoRR abs/2403.08525 (2024) - [i78]Michael Neri, Archontis Politis, Daniel Krause, Marco Carli, Tuomas Virtanen:
Speaker Distance Estimation in Enclosures from Single-Channel Audio. CoRR abs/2403.17514 (2024) - [i77]Andreas Triantafyllopoulos, Iosif Tsangko, Alexander Gebhard, Annamaria Mesaros, Tuomas Virtanen, Björn W. Schuller:
Computer Audition: From Task-Specific Machine Learning to Foundation Models. CoRR abs/2407.15672 (2024) - [i76]Martin Moritz, Toni Olán, Tuomas Virtanen:
Noise-to-mask Ratio Loss for Deep Neural Network based Audio Watermarking. CoRR abs/2408.15553 (2024) - 2023
- [c175]Paul Magron, Tuomas Virtanen:
Spectrogram Inversion for Audio Source Separation via Consistency, Mixing, and Magnitude Constraints. EUSIPCO 2023: 36-40 - [c174]David Diaz-Guerra, Archontis Politis, Tuomas Virtanen:
Position Tracking of a Varying Number of Sound Sources with Sliding Permutation Invariant Training. EUSIPCO 2023: 251-255 - [c173]Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, Okko Räsänen:
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System. EUSIPCO 2023: 431-435 - [c172]Parthasaarathy Sudarsanam, Tuomas Virtanen:
Attention-Based Methods For Audio Question Answering. EUSIPCO 2023: 750-754 - [c171]Huang Xie, Okko Räsänen, Tuomas Virtanen:
On Negative Sampling for Contrastive Audio-Text Retrieval. ICASSP 2023: 1-5 - [c170]Wei Xie, Yanxiong Li, Qianhua He, Wenchang Cao, Tuomas Virtanen:
Few-shot Class-incremental Audio Classification Using Adaptively-refined Prototypes. INTERSPEECH 2023: 301-305 - [c169]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Aleksander Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. NeurIPS 2023 - [c168]Diep Luong, Minh Tran, Shayan Gharib, Konstantinos Drossos, Tuomas Virtanen:
Representation Learning for Audio Privacy Preservation Using Source Separation and Robust Adversarial Learning. WASPAA 2023: 1-5 - [c167]Michael Neri, Archontis Politis, Daniel Krause, Marco Carli, Tuomas Virtanen:
Single-Channel Speaker Distance Estimation in Reverberant Environments. WASPAA 2023: 1-5 - [d18]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.0.0. Zenodo, 2023 [all versions] - [d17]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Aapo Hakala, Shusuke Takahashi, Daniel Aleksander Krause, Naoya Takahashi, Sharath Adavanne, Yuichiro Koyama, Kengo Uchida, Yuki Mitsufuji, Tuomas Virtanen:
STARSS23: Sony-TAu Realistic Spatial Soundscapes 2023. Version 1.1.0. Zenodo, 2023 [all versions] - [i75]Paul Magron, Tuomas Virtanen:
Spectrogram Inversion for Audio Source Separation via Consistency, Mixing, and Magnitude Constraints. CoRR abs/2303.01864 (2023) - [i74]Wang Dai, Archontis Politis, Tuomas Virtanen:
Multi-Channel Masking with Learnable Filterbank for Sound Source Separation. CoRR abs/2303.07816 (2023) - [i73]Shayan Gharib, Minh Tran, Diep Luong, Konstantinos Drossos, Tuomas Virtanen:
Adversarial Representation Learning for Robust Privacy Preservation in Audio. CoRR abs/2305.00011 (2023) - [i72]Wei Xie, Yanxiong Li, Qianhua He, Wenchang Cao, Tuomas Virtanen:
Few-shot Class-incremental Audio Classification Using Adaptively-refined Prototypes. CoRR abs/2305.18045 (2023) - [i71]Parthasaarathy Sudarsanam, Tuomas Virtanen:
Attention-Based Methods For Audio Question Answering. CoRR abs/2305.19769 (2023) - [i70]Khazar Khorrami, María Andrea Cruz Blandón, Tuomas Virtanen, Okko Räsänen:
Simultaneous or Sequential Training? How Speech Representations Cooperate in a Multi-Task Self-Supervised Learning System. CoRR abs/2306.02972 (2023) - [i69]David Diaz-Guerra, Archontis Politis, Antonio Miguel, José Ramón Beltrán, Tuomas Virtanen:
Permutation Invariant Recurrent Neural Networks for Sound Source Tracking Applications. CoRR abs/2306.08510 (2023) - [i68]Kazuki Shimada, Archontis Politis, Parthasaarathy Sudarsanam, Daniel Krause, Kengo Uchida, Sharath Adavanne, Aapo Hakala, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen, Yuki Mitsufuji:
STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. CoRR abs/2306.09126 (2023) - [i67]Huang Xie, Khazar Khorrami, Okko Räsänen, Tuomas Virtanen:
Crowdsourcing and Evaluating Text-Based Audio Retrieval Relevances. CoRR abs/2306.09820 (2023) - [i66]Diep Luong, Minh Tran, Shayan Gharib, Konstantinos Drossos, Tuomas Virtanen:
Representation Learning for Audio Privacy Preservation using Source Separation and Robust Adversarial Learning. CoRR abs/2308.04960 (2023) - [i65]Szymon Drgas, Lars Bramsløw, Archontis Politis, Gaurav Naithani, Tuomas Virtanen:
Dynamic Processing Neural Network Architecture For Hearing Loss Compensation. CoRR abs/2310.16550 (2023) - [i64]Yuzhu Wang, Archontis Politis, Tuomas Virtanen:
Attention-Driven Multichannel Speech Enhancement in Moving Sound Source Scenarios. CoRR abs/2312.10756 (2023) - 2022
- [j43]Björn W. Schuller, Yonina C. Eldar, Maja Pantic, Shrikanth Narayanan, Tuomas Virtanen, Jianhua Tao:
Editorial: Intelligent Signal Analysis for Contagious Virus Diseases. IEEE J. Sel. Top. Signal Process. 16(2): 159-163 (2022) - [j42]Shanshan Wang, Archontis Politis, Annamaria Mesaros, Tuomas Virtanen:
Self-Supervised Learning of Audio Representations From Audio-Visual Data Using Spatial Alignment. IEEE J. Sel. Top. Signal Process. 16(6): 1467-1479 (2022) - [c166]Irene Martín-Morató, Francesco Paissan, Alberto Ancilotto, Toni Heittola, Annamaria Mesaros, Elisabetta Farella, Alessio Brutti, Tuomas Virtanen:
Low-Complexity Acoustic Scene Classification in DCASE 2022 Challenge. DCASE 2022 - [c165]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events. DCASE 2022 - [c164]Huang Xie, Samuel Lipping, Tuomas Virtanen:
Language-Based Audio Retrieval Task in DCASE 2022 Challenge. DCASE 2022 - [c163]Duygu Dogan, Huang Xie, Toni Heittola, Tuomas Virtanen:
Zero-Shot Audio Classification using Image Embeddings. EUSIPCO 2022: 1-5 - [c162]Ville-Veikko Eklund, Aleksandr Diment, Tuomas Virtanen:
Noise, Device and Room Robustness Methods for Pronunciation Error Detection. EUSIPCO 2022: 140-144 - [c161]Samuel Lipping, Parthasaarathy Sudarsanam, Konstantinos Drossos, Tuomas Virtanen:
Clotho-AQA: A Crowdsourced Dataset for Audio Question Answering. EUSIPCO 2022: 1140-1144 - [c160]Huang Xie, Okko Räsänen, Konstantinos Drossos, Tuomas Virtanen:
Unsupervised Audio-Caption Aligning Learns Correspondences Between Individual Sound Events and Textual Phrases. ICASSP 2022: 8867-8871 - [c159]Yanxiong Li, Wenchang Cao, Konstantinos Drossos, Tuomas Virtanen:
Domestic Activity Clustering from Audio via Depthwise Separable Convolutional Autoencoder Network. MMSP 2022: 1-6 - [c158]Gaurav Naithani, Kirsi Pietilä, Riitta Niemistö, Erkki Paajanen, Tero Takala, Tuomas Virtanen:
Subjective Evaluation of Deep Neural Network Based Speech Enhancement Systems in Real-World Conditions. MMSP 2022: 1-6 - [d16]Samuel Lipping, Parthasaarathy Sudarsanam, Konstantinos Drossos, Tuomas Virtanen:
Clotho-AQA dataset. Zenodo, 2022 - [d15]Archontis Politis, Sharath Adavanne, Tuomas Virtanen:
TAU Spatial Room Impulse Response Database (TAU-SRIR DB). Zenodo, 2022 - [d14]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.0.0. Zenodo, 2022 [all versions] - [d13]Archontis Politis, Yuki Mitsufuji, Parthasaarathy Sudarsanam, Kazuki Shimada, Sharath Adavanne, Yuichiro Koyama, Daniel Aleksander Krause, Naoya Takahashi, Shusuke Takahashi, Tuomas Virtanen:
STARSS22: Sony-TAu Realistic Spatial Soundscapes 2022 dataset. Version 1.1.0. Zenodo, 2022 [all versions] - [i63]Samuel Lipping, Parthasaarathy Sudarsanam, Konstantinos Drossos, Tuomas Virtanen:
Clotho-AQA: A Crowdsourced Dataset for Audio Question Answering. CoRR abs/2204.09634 (2022) - [i62]Shanshan Wang, Archontis Politis, Annamaria Mesaros, Tuomas Virtanen:
Self-supervised Learning of Audio Representations from Audio-Visual Data using Spatial Alignment. CoRR abs/2206.00970 (2022) - [i61]Archontis Politis, Kazuki Shimada, Parthasaarathy Sudarsanam, Sharath Adavanne, Daniel Krause, Yuichiro Koyama, Naoya Takahashi, Shusuke Takahashi, Yuki Mitsufuji, Tuomas Virtanen:
STARSS22: A dataset of spatial recordings of real scenes with spatiotemporal annotations of sound events. CoRR abs/2206.01948 (2022) - [i60]Duygu Dogan, Huang Xie, Toni Heittola, Tuomas Virtanen:
Zero-Shot Audio Classification using Image Embeddings. CoRR abs/2206.04984 (2022) - [i59]Yanxiong Li, Wenchang Cao, Konstantinos Drossos, Tuomas Virtanen:
Domestic Activity Clustering from Audio via Depthwise Separable Convolutional Autoencoder Network. CoRR abs/2208.02406 (2022) - [i58]Gaurav Naithani, Kirsi Pietilä, Riitta Niemistö, Erkki Paajanen, Tero Takala, Tuomas Virtanen:
Subjective Evaluation of Deep Neural Network Based Speech Enhancement Systems in Real-World Conditions. CoRR abs/2208.05057 (2022) - [i57]David Diaz-Guerra, Archontis Politis, Tuomas Virtanen:
Position tracking of a varying number of sound sources with sliding permutation invariant training. CoRR abs/2210.14536 (2022) - [i56]Huang Xie, Okko Räsänen, Tuomas Virtanen:
On Negative Sampling for Contrastive Audio-Text Retrieval. CoRR abs/2211.04070 (2022) - 2021
- [j41]Szymon Drgas, Tuomas Virtanen:
Joint speaker separation and recognition using non-negative matrix deconvolution with adaptive dictionary. Comput. Speech Lang. 70: 101223 (2021) - [j40]Annamaria Mesaros, Toni Heittola, Tuomas Virtanen, Mark D. Plumbley:
Sound Event Detection: A tutorial. IEEE Signal Process. Mag. 38(5): 67-83 (2021) - [j39]Archontis Politis, Annamaria Mesaros, Sharath Adavanne, Toni Heittola, Tuomas Virtanen:
Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019. IEEE ACM Trans. Audio Speech Lang. Process. 29: 684-698 (2021) - [j38]Huang Xie, Tuomas Virtanen:
Zero-Shot Audio Classification Via Semantic Embeddings. IEEE ACM Trans. Audio Speech Lang. Process. 29: 1233-1242 (2021) - [c157]Shanshan Wang, Annamaria Mesaros, Toni Heittola, Tuomas Virtanen:
Audio-Visual Scene Classification: Analysis of DCASE 2021 Challenge Submissions. DCASE 2021: 45-49 - [c156]Irene Martín-Morató, Toni Heittola, Annamaria Mesaros, Tuomas Virtanen:
Low-Complexity Acoustic Scene Classification for Multi-Device Audio: Analysis of DCASE 2021 Challenge Systems. DCASE 2021: 85-89 - [c155]Archontis Politis, Sharath Adavanne, Daniel Krause, Antoine Deleforge, Prerak Srivastava, Tuomas Virtanen:
A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection. DCASE 2021: 125-129 - [c154]Shanshan Wang, Gaurav Naithani, Archontis Politis, Tuomas Virtanen:
Deep Neural Network Based Low-Latency Speech Separation with Asymmetric Analysis-Synthesis Window Pair. EUSIPCO 2021: 301-305 - [c153]Pasi Pertilä, Emre Cakir, Aapo Hakala, Eemi Fagerlund, Tuomas Virtanen, Archontis Politis, Antti J. Eronen:
Mobile Microphone Array Speech Detection and Localization in Diverse Everyday Environments. EUSIPCO 2021: 406-410 - [c152]Slobodan Djukanovic, Yash Patel, Jirí Matas, Tuomas Virtanen:
Neural network-based acoustic vehicle counting. EUSIPCO 2021: 561-565 - [c151]An Tran, Konstantinos Drossos, Tuomas Virtanen:
WaveTransformer: An Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information. EUSIPCO 2021: 576-580 - [c150]Huang Xie, Okko Räsänen, Tuomas Virtanen:
Zero-Shot Audio Classification with Factored Linear and Nonlinear Acoustic-Semantic Projections. ICASSP 2021: 326-330 - [c149]Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra:
Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags. ICASSP 2021: 596-600 - [c148]Shanshan Wang, Annamaria Mesaros, Toni Heittola, Tuomas Virtanen:
A Curated Dataset of Urban Scenes for Audio-Visual Scene Analysis. ICASSP 2021: 626-630 - [c147]Björn W. Schuller, Tuomas Virtanen, Maria Riveiro, Georgios Rizos, Jing Han, Annamaria Mesaros, Konstantinos Drossos:
Towards Sonification in Multimodal and User-friendly Explainable Artificial Intelligence. ICMI 2021: 788-792 - [c146]Sharath Adavanne, Archontis Politis, Tuomas Virtanen:
Differentiable Tracking-Based Training of Deep Learning Sound Source Localizers. WASPAA 2021: 211-215 - [d12]Konstantinos Drossos, Samuel Lipping, Tuomas Virtanen:
Clotho dataset. Version 2.0. Zenodo, 2021 [all versions] - [d11]Konstantinos Drossos, Samuel Lipping, Tuomas Virtanen:
Clotho dataset. Version 2.1. Zenodo, 2021 [all versions] - [d10]Archontis Politis, Sharath Adavanne, Tuomas Virtanen:
TAU-NIGENS Spatial Sound Events 2021. Version 1. Zenodo, 2021 [all versions] - [d9]Archontis Politis, Sharath Adavanne, Tuomas Virtanen:
TAU-NIGENS Spatial Sound Events 2021. Version 1.1.0. Zenodo, 2021 [all versions] - [d8]Archontis Politis, Sharath Adavanne, Tuomas Virtanen:
TAU-NIGENS Spatial Sound Events 2021. Version 1.2.0. Zenodo, 2021 [all versions] - [i55]Shanshan Wang, Toni Heittola, Annamaria Mesaros, Tuomas Virtanen:
Audio-visual scene classification: analysis of DCASE 2021 Challenge submissions. CoRR abs/2105.13675 (2021) - [i54]Archontis Politis, Sharath Adavanne, Daniel Krause, Antoine Deleforge, Prerak Srivastava, Tuomas Virtanen:
A Dataset of Dynamic Reverberant Sound Scenes with Directional Interferers for Sound Event Localization and Detection. CoRR abs/2106.06999 (2021) - [i53]Shanshan Wang, Gaurav Naithani, Archontis Politis, Tuomas Virtanen:
Deep neural network Based Low-latency Speech Separation with Asymmetric analysis-Synthesis Window Pair. CoRR abs/2106.11794 (2021) - [i52]Sharath Adavanne, Archontis Politis, Tuomas Virtanen:
Differentiable Tracking-Based Training of Deep Learning Sound Source Localizers. CoRR abs/2111.00030 (2021) - 2020
- [j37]Paul Magron, Tuomas Virtanen:
Online Spectrogram Inversion for Low-Latency Audio Source Separation. IEEE Signal Process. Lett. 27: 306-310 (2020) - [j36]Shuyang Zhao, Toni Heittola, Tuomas Virtanen:
Active Learning for Sound Event Detection. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2895-2905 (2020) - [c145]Emre Çakir, Konstantinos Drossos, Tuomas Virtanen:
Multi-Task Regularization Based on Infrequent Classes for Audio Captioning. DCASE 2020: 6-10 - [c144]Toni Heittola, Annamaria Mesaros, Tuomas Virtanen:
Acoustic Scene Classification in DCASE 2020 Challenge: Generalization Across Devices and Low Complexity Solutions. DCASE 2020: 56-60 - [c143]Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen:
Temporal Sub-Sampling of Audio Feature Sequences for Automated Audio Captioning. DCASE 2020: 110-114 - [c142]Archontis Politis, Sharath Adavanne, Tuomas Virtanen:
A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection. DCASE 2020: 165-169 - [c141]Niccolò Nicodemo, Gaurav Naithani, Konstantinos Drossos, Tuomas Virtanen, Roberto Saletti:
Memory Requirement Reduction of Deep Neural Networks for Field Programmable Gate Arrays Using Low-Bit Quantization of Parameters. EUSIPCO 2020: 466-470 - [c140]Yanxiong Li, Mingle Liu, Konstantinos Drossos, Tuomas Virtanen:
Sound Event Detection Via Dilated Convolutional Recurrent Neural Networks. ICASSP 2020: 286-290 - [c139]Konstantinos Drossos, Samuel Lipping, Tuomas Virtanen:
Clotho: an Audio Captioning Dataset. ICASSP 2020: 736-740 - [c138]Konstantinos Drossos, Stylianos I. Mimilakis, Shayan Gharib, Yanxiong Li, Tuomas Virtanen:
Sound Event Detection with Depthwise Separable and Dilated Convolutions. IJCNN 2020: 1-7 - [c137]Slobodan Djukanovic, Jiri Matas, Tuomas Virtanen:
Robust Audio-Based Vehicle Counting in Low-to-Moderate Traffic Flow. IV 2020: 1608-1614 - [c136]Pyry Pyykkönen, Stylianos I. Mimilakis, Konstantinos Drossos, Tuomas Virtanen:
Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation. MMSP 2020: 1-6 - [d7]Konstantinos Drossos, Samuel Lipping, Tuomas Virtanen:
Audio captioning DCASE 2020 evaluation (testing) split. Zenodo, 2020 - [d6]Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra:
Dataset used in COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations. Zenodo, 2020 - [d5]Shayan Gharib, Konstantinos Drossos, Eemi Fagerlund, Tuomas Virtanen:
VOICe Dataset. Zenodo, 2020 - [i51]Konstantinos Drossos, Stylianos Ioannis Mimilakis, Shayan Gharib, Yanxiong Li, Tuomas Virtanen:
Sound Event Detection with Depthwise Separable and Dilated Convolutions. CoRR abs/2002.00476 (2020) - [i50]Shuyang Zhao, Toni Heittola, Tuomas Virtanen:
Active Learning for Sound Event Detection. CoRR abs/2002.05033 (2020) - [i49]Archontis Politis, Sharath Adavanne, Tuomas Virtanen:
A Dataset of Reverberant Spatial Sound Scenes with Moving Sources for Sound Event Localization and Detection. CoRR abs/2006.01919 (2020) - [i48]Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra:
COALA: Co-Aligned Autoencoders for Learning Semantically Enriched Audio Representations. CoRR abs/2006.08386 (2020) - [i47]Khoa Nguyen, Konstantinos Drossos, Tuomas Virtanen:
Temporal Sub-sampling of Audio Feature Sequences for Automated Audio Captioning. CoRR abs/2007.02676 (2020) - [i46]Pyry Pyykkönen, Stylianos Ioannis Mimilakis, Konstantinos Drossos, Tuomas Virtanen:
Depthwise Separable Convolutions Versus Recurrent Neural Networks for Monaural Singing Voice Separation. CoRR abs/2007.02683 (2020) - [i45]Emre Çakir, Konstantinos Drossos, Tuomas Virtanen:
Multi-task Regularization Based on Infrequent Classes for Audio Captioning. CoRR abs/2007.04660 (2020) - [i44]Konstantinos Drossos, Stylianos Ioannis Mimilakis, Tuomas Virtanen:
Conditioned Time-Dilated Convolutions for Sound Event Detection. CoRR abs/2007.05183 (2020) - [i43]Archontis Politis, Annamaria Mesaros, Sharath Adavanne, Toni Heittola, Tuomas Virtanen:
Overview and Evaluation of Sound Event Localization and Detection in DCASE 2019. CoRR abs/2009.02792 (2020) - [i42]An Tran, Konstantinos Drossos, Tuomas Virtanen:
WaveTransformer: A Novel Architecture for Audio Captioning Based on Learning Temporal and Time-Frequency Information. CoRR abs/2010.11098 (2020) - [i41]Slobodan Djukanovic, Yash Patel, Jiri Matas, Tuomas Virtanen:
Neural Network-based Acoustic Vehicle Counting. CoRR abs/2010.11659 (2020) - [i40]Slobodan Djukanovic, Jiri Matas, Tuomas Virtanen:
Robust Audio-Based Vehicle Counting in Low-to-Moderate Traffic Flow. CoRR abs/2010.11716 (2020) - [i39]Xavier Favory, Konstantinos Drossos, Tuomas Virtanen, Xavier Serra:
Learning Contextual Tag Embeddings for Cross-Modal Alignment of Audio and Tags. CoRR abs/2010.14171 (2020)
2010 – 2019
- 2019
- [j35]Víctor M. García-Molla, Pablo San Juan Sebastián, Tuomas Virtanen, Antonio M. Vidal, Pedro Alonso:
Generalization of the K-SVD algorithm for minimization of β-divergence. Digit. Signal Process. 92: 47-53 (2019) - [j34]Sharath Adavanne, Archontis Politis, Joonas Nikunen, Tuomas Virtanen:
Sound Event Localization and Detection of Overlapping Sources Using Convolutional Recurrent Neural Networks. IEEE J. Sel. Top. Signal Process. 13(1): 34-48 (2019) - [j33]Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, Tara N. Sainath:
Deep Learning for Audio Signal Processing. IEEE J. Sel. Top. Signal Process. 13(2): 206-219 (2019) - [j32]Paul Magron, Tuomas Virtanen:
Complex ISNMF: A Phase-Aware Model for Monaural Audio Source Separation. IEEE ACM Trans. Audio Speech Lang. Process. 27(1): 20-31 (2019)