APSIPA 2021: Tokyo, Japan
- Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2021, Tokyo, Japan, December 14-17, 2021. IEEE 2021, ISBN 978-988-14768-9-0
- Jiahong Zhao, Christian Ritz:
Coprime Microphone Arrays for Estimating Speech Direction of Arrival Using Deep Learning. 1-8
- Takayuki Sasaki, Ryuichi Tanida, Masaki Kitahara, Hideaki Kimata:
Fast-Parallel Singular Value Thresholding for Many Small Matrices based on Geometric Feature of Singular Values. 1-8
- Tomomi Hatano, Tomomi Takezawa, Masashi Sugimoto, Kuangzhe Xu, Takashi Morikawa, Yasuhiro Azuma, Kazuo Shibuta, Noriko Nagata:
Measuring Attractiveness of Tourism Resources by Focusing on Kansei Value Structure: Possibility of Inviting Visitors Using the Japanese Heritage "Ako Salt.". 1-7
- Manav Kaushik, Van Tung Pham, Tran The Anh, Eng Siong Chng:
End-to-End Speaker Age and Height Estimation using Attention Mechanism and Triplet Loss. 1-8
- Yuto Ueda, Hidetoshi Nakashima, Yuuki Yuno, Nobuhiko Hiruma:
Binaural Adaptive Feedback Cancellation Based on Prediction Error Method Using Interaural Level Differences in Hearing Device. 9-16
- Cheng-Yu Cai, Yu-Hui Su, Li Su:
Dual-channel Drum Separation for Low-cost Drum Recording Using Non-negative Matrix Factorization. 17-22
- Daichi Hayakawa, Takehiko Kagoshima, Hiroshi Fujimura:
Mask-based Beamforming Using Complex-valued Neural Network for Recognition of Spatial Target Speech. 23-29
- Toru Takahashi, Takuma Ekawa, Masato Nakayama:
Moving Sound Source Tracking in Wide Space by Multiple Microphone Arrays. 30-35
- Kai Li, Masashi Unoki, Yongwei Li, Jianwu Dang, Masato Akagi:
Study on Simultaneous Estimation of Glottal Source and Vocal Tract Parameters by ARMAX-LF Model for Speech Analysis/Synthesis. 36-43
- Oguz Meteer, Marco Jan Gerrit Bekooij:
Low-Power Booth Multiplication without Dynamic Range Detection in FFTs for FMCW Radar Signal Processing. 44-48
- Xuehan Wang, Gongping Huang, Israel Cohen, Jacob Benesty, Jingdong Chen:
Kronecker Product Adaptive Beamforming for Microphone Arrays. 49-54
- Oguz Meteer, Marco Jan Gerrit Bekooij:
An Optimal Variable-Latency Architecture for Deterministic Approaches to Stochastic Computing with Unary Bit Stream Preserving Properties. 55-62
- Hiroyasu Takagi, Norishige Fukushima:
Domain Specific Description in Halide for Randomized Image Convolution. 63-69
- Kei Kawamura, Kyohei Unno, Yoshitaka Kidani:
Fast Still Picture Coding for VVC. 70-73
- Takumi Kondo, Yoshihiro Maeda, Norishige Fukushima:
Accelerating Finite Impulse Response Filtering Using Tensor Cores. 74-79
- Ippei Okuda, Masahiro Takaoka, Tomoaki Tsumura:
Hisui: an Image and Video Processing Framework with Auto-optimizer. 80-87
- Yoshihiro Maeda, Norishige Fukushima, Takayuki Hamamoto:
Color Transformation for Compressive Computing in Image Filtering. 88-92
- Xumin Yu, Yan Feng, Yanlong Gao:
Imbalanced sample feature enhancement of hyperspectral imagery classification. 93-99
- Jin Wu, Wei Dai, Yu Wang, Bo Zhao:
Improved Fruit Fly Optimization Algorithm Based on Simulated Annealing in Neural Network. 100-105
- Yun Zhu, Chuanzhan Hu, Lin Jiang, Xubang Shen:
An Implementation Method of HEVC Dataflow Graph Based on Reconfigurable Processer. 106-112
- Binghong Jiang:
An improved naive bayes model for air temperature prediction. 113-120
- Rong Yang, Xiaoyan Xie, Miaomiao Chai, Lin Fang, Wanqi He, Jingtao Sun:
An IDE for Reconfigurable Video Array Processor. 121-126
- Xiaoyan Xie, Miaomiao Chai, Zhuolin Du, Kun Yang, Shaorun Yin:
A Reconfigurable Parallelization of Generative Adversarial Networks based on Array Processor. 127-132
- Junyong Deng, Qingqing Ma, Zekun Ye:
Performance Characterization of Rasterization Algorithms for Reconfigurable Graphics Processor. 133-140
- Tse Wei Chiu, You Sheng Guo, Pao-Chi Chang:
Non-parallel Voice Conversion with Generative Attentional Networks. 141-145
- Hyunkook Park, Vien Gia An, Yeong Jun Koh, Chul Lee:
Unpaired Image Demoiréing Based on Cyclic Moiré Learning. 146-150
- Youngjin Oh, Gu Yong Park, Haesoo Chung, Sunwoo Cho, Nam Ik Cho:
Residual Dilated U-Net with Spatially Adaptive Normalization for the Restoration of Under Display Camera Images. 151-157
- Jae Hoon Shim, Hochang Rhee, Yeong Il Jang, Geonsu Lee, Seyun Kim, Nam Ik Cho:
Lossless Image Compression Based on Image Decomposition and Progressive Prediction Using Convolutional Neural Networks. 158-163
- Jintae Kim, Junheum Park, Whan Choi, Chang-Su Kim:
Facial Video Frame Interpolation Combining Symmetric and Asymmetric Motions. 164-169
- Cong Tin Nguyen, Bach-Tung Pham, Thi Phuong Le, Tzu-Chiang Tai, Jia-Ching Wang:
Face Anti-Spoofing Using Multi-Branch CNN. 170-173
- Bungo Konishi, Akira Hirose, Ryo Natsuaki:
Generalization characteristics of complex-valued reservoir computing for interferometric synthetic aperture radar applications. 174-178
- Takehiko Mizoguchi, Isao Yamada:
A Hypercomplex Tensor-SVD and Its Application. 179-186
- Yuto Okawa, Tohru Nitta:
Learning Properties of Feedforward Neural Networks Using Dual Numbers. 187-192
- Akira Hirose, Soshi Shimomura:
Adaptive Subsurface Imaging based on Peak Phase-Profile: The Significance in Separation of Scattering Phase from Propagation Phase. 193-199
- Yicheng Song, Akira Hirose:
Discussion on the Origin of the Strength of Phasor Quaternion Self-Organizing Map. 200-204
- Hiroki Tanji, Takahiro Murakami:
Learning the Statistical Model of the NMF Using the Deep Multiplicative Update Algorithm with Applications. 205-211
- Ryota Kato, Kenji Suyama:
An Improved Parameter Free Genetic Algorithm for CSD-FIR Filter design. 212-217
- Yuta Harigae, Kazuki Matumoto, Kenji Suyama:
A Proposal toward Standardization of Design Examples for IIR Filter Design Methods. 218-221
- Shunsuke Koshita:
On Optimal Realizations for All-Pass Fractional Delay Digital Filters. 222-225
- Takashi Yoshida:
Low-pass maximally flat IIR digital differentiator design with arbitrary flatness degree. 226-231
- Jitendra K. Tugnait:
On Sparse Graph Estimation Under Statistical and Laplacian Constraints. 232-239
- Marisa Mohr, Ralf Möller:
Ordering Principal Components of Multivariate Fractional Brownian Motion for Solving Inverse Problems. 240-247
- Ryu Takeda, Kazuhiro Nakadai, Kazunori Komatani:
Spatial Normalization to Reduce Positional Complexity in Direction-aided Supervised Binaural Sound Source Separation. 248-253
- Tomoro Tanaka, Kohei Yatabe, Yasuhiro Oikawa:
Phase-aware Audio Inpainting Based on Instantaneous Frequency. 254-258
- Koyo Kugiyama, Kimiko Motonaka, Yoshinobu Kajikawa, Seiji Miyoshi:
Statistical-Mechanical Analysis of Adaptive Volterra Filter for Time-Varying Unknown System. 259-263
- Dailys Arronde Pérez, Hubert Zangl:
High-accuracy reconstruction of periodic signals based on compressive sensing. 264-268
- Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan:
Semi-Supervised Sound Event Detection Using Self-Attention and Multiple Techniques of Consistency Training. 269-274
- Kouki Hori, Nari Tanabe, Masaya Fujisawa:
Nonlinear SVM-Type Automatic Dicision Algorithm in Noisy Environment for Hammering Test System. 275-281
- Yucheng Chen, Mingyi He, Yuchao Dai:
Nearby-person Occlusion Data Augmentation for Human Pose Estimation with Non-extra Annotations. 282-287
- Koki Yasui, Fumihiko Sakaue, Jun Sato, Yu Koyama, Mitsuyasu Matsuura:
Dense Depthmap Prediction from Ultrasonic Sensors. 288-294
- Kazuya Hanamoto, Shuichi Ohno:
Feedback Quantization and Bit Allocation for Networked Control Systems with Rate Limited Channels. 295-298
- Arvid B. Van Den Brink, Marco Jan Gerrit Bekooij:
Enhanced Loop-weakened Belief Propagation Algorithm for Performance Enhanced Polar Code Decoders. 299-304
- Jiyao Liu, Yanxi Zhao, Hao Wu, Dongmei Jiang:
Positional-Spectral-Temporal Attention in 3D Convolutional Neural Networks for EEG Emotion Recognition. 305-312
- Arvid Trapp, Peter Wolfsteiner:
Integrated spectral kurtosis analysis. 313-317
- Arvid B. Van Den Brink, Marco Jan Gerrit Bekooij:
Computational Complexity Reduced Belief Propagation Algorithm for Polar Code Decoders. 318-323
- Katsuki Fukumoto, Koki Yamada, Yuichi Tanaka:
Node Clustering of Time-Varying Graphs Based on Temporal Label Smoothness. 324-329
- Eisuke Yamagata, Shunsuke Ono:
Recovery of Time Series of Graph Signals Over Dynamic Topology. 330-336
- Arjun Ashok Rao, Hoi-To Wai:
An Empirical Study on Compressed Decentralized Stochastic Gradient Algorithms with Overparameterized Models. 337-343
- Cheng Yang, Fen Wang, Minxiang Ye, Guangtao Zhai, Xiao-Ping Zhang, Vladimir Stankovic, Lina Stankovic:
Model Selection-inspired Coefficients Optimization for Polynomial-Kernel Graph Learning. 344-350
- David Bonet, Antonio Ortega, Javier Ruiz Hidalgo, Sarath Shekkizhar:
Channel-Wise Early Stopping without a Validation Set via NNK Polytope Interpolation. 351-358
- Kuangzhe Xu, Noriko Nagata, Toshihiko Matsuka:
Modeling the dynamics of observational behaviors base on observers' personality traits using hidden Markov Models. 359-365
- Kuangzhe Xu, Kenji Katahira, Yoichi Yamazaki, Fan Zhang, Naoki Nishida, Yuichiro Tamai, Naoyuki Matsuzaki, Noriko Nagata:
Estimating Beverage Preference Based on Subjective Emotional Reactions and EEG Activity. 366-372
- Yoshiko Kawabata, Toshihiko Matsuka:
Aizuchi as a sign of internal information processing and its interpretations by listeners. 380-385
- Yuta Watanabe, Yoshitsugu Manabe, Noriko Yata:
Internal state estimation by thermal image and identification of face and nose position. 386-391
- Kei Irie, Yicheng Qiu, Kiyoshi Nishikawa:
On Improving the Accuracy of Object Detection for High Resolution Images Based on SSD. 392-399
- Yuiko Kumagai, Toshihisa Tanaka:
Detection of Note Onsets From EEG While Listening to Music. 400-405
- Yosuke Sugiura, Shunta Nagamori, Tetsuya Shimamura:
Speech Enhancement Network with Unsupervised Attention using Invariant Information Clustering. 406-409
- Ayana Mussabayeva, Zangar Ermaganbet, Prashant Kumar Jamwal, Muhammad Tahir Akhtar:
Event-Related Spectrogram Representation of EEG for CNN-Based P300 Speller. 410-415
- Timur Okhassov, Prashant Kumar Jamwal, Muhammad Tahir Akhtar:
Cost-Effective Proportionate Affine Projection Algorithm with Variable Parameters for Acoustic Feedback Cancellation. 416-422
- Nurbek Saidnassim, Beibit Abdikenov, Rauan Kelesbekov, Muhammad Tahir Akhtar, Prashant Kumar Jamwal:
Self-supervised Visual Transformers for Breast Cancer Diagnosis. 423-427
- Keiko Ochi, Masaki Kojima, Keiho Owada, Nobutaka Ono, Shigeki Sagayama, Hidenori Yamasue:
Pitch and Volume Stability in the Communicative Response of Adults with Autism. 428-432
- Soky Kak, Sheng Li, Masato Mimura, Chenhui Chu, Tatsuya Kawahara:
On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora. 433-437
- Hao Shi, Longbiao Wang, Sheng Li, Cunhang Fan, Jianwu Dang, Tatsuya Kawahara:
Spectrograms Fusion-based End-to-end Robust Automatic Speech Recognition. 438-442
- Shengqiang Li, Menglong Xu, Xiao-Lei Zhang:
Conformer-based End-to-end Speech Recognition With Rotary Position Embedding. 443-447
- Shengqiang Li, Menglong Xu, Xiao-Lei Zhang:
Efficient conformer-based speech recognition with linear attention. 448-453
- Zhengkun Tian, Jiangyan Yi, Ye Bai, Jianhua Tao, Shuai Zhang, Zhengqi Wen:
One In A Hundred: Selecting the Best Predicted Sequence from Numerous Candidates for Speech Recognition. 454-459
- Atsushi Kojima:
Large-Context Automatic Speech Recognition Based on RNN Transducer. 460-464
- Masato Mimura, Shinsuke Sakai, Tatsuya Kawahara:
An End-To-End Model from Speech to Clean Transcript for Parliamentary Meetings. 465-470
- Kento Fujiwara, Ryoichi Takashima, Chihiro Sugiyama, Nobukazu Tanaka, Kanji Nohara, Kazunori Nozaki, Tetsuya Takiguchi:
Data Augmentation Based on Frequency Warping for Recognition of Cleft Palate Speech. 471-476
- Huaibo Zhao, Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi:
An Investigation of Enhancing CTC Model for Triggered Attention-based Streaming ASR. 477-483
- Protima Nomo Sudro, Rohan Kumar Das, Rohit Sinha, S. R. Mahadeva Prasanna:
Significance of Data Augmentation for Improving Cleft Lip and Palate Speech Recognition. 484-490
- Madhu R. Kamble, Shekhar Nayak, M. Ali Basha Shaik, Shakti P. Rath, Vikram Vij, Hemant A. Patil:
Teager Energy Subband Filtered Features for Near and Far-Field Automatic Speech Recognition. 491-496
- Duo Ma, Nana Hou, Van Tung Pham, Haihua Xu, Eng Siong Chng:
Multitask-based joint learning approach to robust ASR for radio communication speech. 497-502
- Daiki Mori, Kengo Ohta, Ryota Nishimura, Atsunori Ogawa, Norihide Kitaoka:
Advanced language model fusion method for encoder-decoder model in Japanese speech recognition. 503-510
- Mirishkar Sai Ganesh, Vishnu Vidyadhara Raju Vegesna, Meher Dinesh Naroju, Sudhamay Maity, Prakash Yalla, Anil Kumar Vuppala:
CSTD-Telugu Corpus: Crowd-Sourced Approach for Large-Scale Speech data collection. 511-517
- Shi-Yan Weng, Hsuan-Sheng Chiu, Berlin Chen:
An Empirical Study on Transformer-Based End-to-End Speech Recognition with Novel Decoder Masking. 518-522
- Guochen Yu, Yutian Wang, Chengshi Zheng, Hui Wang, Qin Zhang:
CycleGAN-based Non-parallel Speech Enhancement with an Adaptive Attention-in-attention Mechanism. 523-529
- Weixin Meng, Chengshi Zheng, Xiaodong Li:
A Robust Maximum Likelihood Distortionless Response Beamformer based on a Complex Generalized Gaussian Distribution. 530-535
- Shih-Chuan Chu, Chung-Hsien Wu, Yun-Wen Lin:
Speech Enhancement Based on Masking Approach Considering Speech Quality and Acoustic Confidence for Noisy Speech Recognition. 536-540
- Xinyang Feng, Nuo Li, Zunwen He, Yan Zhang, Wancheng Zhang:
DNN-Based Linear Prediction Residual Enhancement for Speech Dereverberation. 541-545
- Zhaopeng Qian, Haijun Niu, Li Wang, Kazuhiro Kobayashi, Shaochuan Zhang, Tomoki Toda:
Mandarin Electro-Laryngeal Speech Enhancement based on Statistical Voice Conversion and Manual Tone Control. 546-552
- Lu Zhang, Mingjiang Wang, Andong Li, Zehua Zhang, Xuyi Zhuang:
Incorporating Multi-Target in Multi-Stage Speech Enhancement Model for Better Generalization. 553-558
- Fei Gao, Haixin Guan:
Low-Power Convolutional Recurrent Neural Network For Monaural Speech Enhancement. 559-563
- Quandong Wang, Junnan Wu, Zhao Yan, Sichong Qian, Liyong Guo, Lichun Fan, Weiji Zhuang, Peng Gao, Yujun Wang:
Multi-Channel Speech Enhancement with 2-D Convolutional Time-Frequency Domain Features and a Pre-Trained Acoustic Model. 564-570
- Protima Nomo Sudro, Rohit Sinha, S. R. Mahadeva Prasanna:
Processing Phoneme Specific Segments for Cleft Lip and Palate Speech Enhancement. 571-577
- Sota Misawa, Norihiro Takamune, Tomohiko Nakamura, Daichi Kitamura, Hiroshi Saruwatari, Masakazu Une, Shoji Makino:
Speech Enhancement by Noise Self-Supervised Rank-Constrained Spatial Covariance Matrix Estimation via Independent Deeply Learned Matrix Analysis. 578-584
- Yoshiki Masuyama, Kouei Yamaoka, Yuma Kinoshita, Nobutaka Ono:
Causal Distortionless Response Beamforming by Alternating Direction Method of Multipliers. 585-590
- Jinyoung Lee, Hong-Goo Kang:
Stacked U-Net with High-Level Feature Transfer for Parameter Efficient Speech Enhancement. 591-595
- Hanako Segawa, Li Li, Shoji Makino, Takeshi Yamada:
Extension of virtual microphone technique to multiple real microphones and investigation of the impact of phase and amplitude interpolation on speech enhancement. 597-602
- Kohei Saijo, Kazuhiro Katagiri, Masaru Fujieda, Tetsunori Kobayashi, Tetsuji Ogawa:
Comparative Study on DNN-based Minimum Variance Beamforming Robust to Small Movements of Sound Sources. 603-607
- Kazushi Nakazawa, Kazuhiro Kondo:
Improvements to Non-Intrusive Intelligibility Prediction for Reverberant Speech. 608-613
- Wenjing Yang, Jing Wang, Hongfeng Li, Na Xu, Fei Xiang, Kai Qian, Shenghua Hu:
A Target Speaker Separation Neural Network with Joint-Training. 614-618
- Qian-Bei Hong, Chung-Hsien Wu, Thanh Binh Nguyen, Hsin-Min Wang:
Improvement of Spatial Ambiguity in Multi-Channel Speech Separation Using Channel Attention. 619-623
- Kohei Ozamoto, Kuniaki Uto, Koji Iwano, Koichi Shinoda:
Noise-Tolerant Time-Domain Speech Separation with Noise Bases. 624-629
- Jianyu Wang, Shanzheng Guan, Xiao-Lei Zhang:
Minimum-volume regularized ILRMA for blind audio source separation. 630-634
- Wenbo Zhu, Mou Wang, Xiao-Lei Zhang, Susanto Rahardja:
A comparison of handcrafted, parameterized, and learnable features for speech separation. 635-639
- Masahito Togami, Robin Scheibler:
Over-Determined Semi-Blind Speech Source Separation. 640-645
- Juntao Yu, Ting Jiang, JiaCheng Yu:
Group Multi-Scale convolutional Network for Monaural Speech Enhancement in Time-domain. 646-650
- Yusaku Mizobuchi, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo:
Prior Distribution Design for Music Bleeding-Sound Reduction Based on Nonnegative Matrix Factorization. 651-658
- Yen-Ju Lu, Yu Tsao, Shinji Watanabe:
A Study on Speech Enhancement Based on Diffusion Probabilistic Model. 659-666
- Xin Fang, Zhen-Hua Ling, Lei Sun, Shutong Niu, Jun Du, Cong Liu, Zhi-Chao Sheng:
A Deep Analysis of Speech Separation Guided Diarization Under Realistic Conditions. 667-671
- Qijie Shao, Jingyong Hou, Yanxin Hu, Qing Wang, Lei Xie, Xin Lei:
Target Speaker Extraction for Customizable Query-by-Example Keyword Spotting. 672-678
- Chen Chen, Nana Hou, Duo Ma, Eng Siong Chng:
Time Domain Speech Enhancement With Attentive Multi-scale Approach. 679-683
- Adrien Llave, Simon Leglaive:
On Speech Sparsity for Computational Efficiency and Noise Reduction in Hearing Aids. 684-688
- Qingjian Lin, Lin Yang, Xuyang Wang, Luyuan Xie, Chen Jia, Junjie Wang:
Sparsely Overlapped Speech Training in the Time Domain: Joint Learning of Target Speech Separation and Personal VAD Benefits. 689-693
- Yuuki Tachioka:
Integration of Annotator-wise Estimations for Emotion Recognition by Using Group Softmax. 694-699
- Xingfeng Li, Taiyang Guo, Xinhui Hu, Xinkang Xu, Jianwu Dang, Masato Akagi:
Hierarchical Prosody Analysis Improves Categorical and Dimensional Emotion Recognition. 700-704
- Simon W. McKnight, Aidan O. T. Hogg, Vincent W. Neo, Patrick A. Naylor:
A Study of Salient Modulation Domain Features for Speaker Identification. 705-712
- Di Wang, Lantian Li, Hongzhi Yu, Dong Wang:
A Study on Decoupled Probabilistic Linear Discriminant Analysis. 713-718
- Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang:
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly. 719-724
- Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita:
Speech Emotion Recognition with Fusion of Acoustic- and Linguistic-Feature-Based Decisions. 725-730
- Bagus Tris Atmaja, Akira Sasou, Masato Akagi:
Automatic Naturalness Recognition from Acted Speech Using Neural Networks. 731-736
- Purva Barche, Krishna Gurugubelli, Anil Kumar Vuppala:
Comparative Study of Filter Banks to Improve the Performance of Voice Disorder Assessment Systems using LTAS Features. 737-742
- Xiaoquan Ke, Man-Wai Mak, Jinchao Li, Helen M. Meng:
Dual Dropout Ranking of Linguistic Features for Alzheimer's Disease Recognition. 743-749
- Zhaohang Zhang, Xiaohui Zhang, Min Guo, Wei-Qiang Zhang, Ke Li, Yukai Huang:
A Multilingual Framework Based on Pre-training Model for Speech Emotion Recognition. 750-755