


APSIPA 2024: Macau
- Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2024, Macau, December 3-6, 2024. IEEE 2025, ISBN 979-8-3503-6733-1

- Tsun-Hin Cheung, Ka-Chun Fung, Songjiang Lai, Kwan-Ho Lin, Vincent T. Y. Ng, Kin-Man Lam:
Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection. 1-6 - Vu-An Hoang, Minh-Hanh Tran, Viet Hang Dao, Thanh-Hai Tran:
GILED: Lesion Detection of Gastrointestinal Tract from Endoscopic Images and Medical Notes. 1-6 - Yutsuki Takeuchi, Taishi Nakashima, Nobutaka Ono, Takashi Takazawa, Shuhei Shimanoe, Yoshinori Tsuchiya:

Experimental Evaluation of Speech Enhancement for In-Car Environment Using Blind Source Separation and DNN-based Noise Suppression. 1-6 - Wataru Nakata, Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:

NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec. 1-6 - Gen Sato, Yusuke Ikeda:
Data-Driven Physics-Informed Neural Network for Sound Field Estimation in Rooms of Arbitrary Size. 1-5 - Jong In Kim, Sunhee Kim, Minhwa Chung:

Generating Phonetic Transcriptions for Korean English L2 Learners Using Multiple Self-Supervised-Model-Based ASR Systems and Rover Method. 1-6 - Jiaming Zhang, Jijie Wu, Xiaoxu Li:

Visual semantic alignment network based on pre-trained ViT for few-shot image classification. 1-6 - Song-Jiang Lai, Tsun-Hin Cheung, Ka-Chun Fung, Tian-Shan Liu, Kin-Man Lam:
An End-to-End Two-Stream Network Based on RGB Flow and Representation Flow for Human Action Recognition. 1-6 - Libo Zhang, Yuxuan Han, Wenbin Lin, Jingwang Ling, Feng Xu:

PRTGaussian: Efficient Relighting Using 3D Gaussians with Precomputed Radiance Transfer. 1-6 - Woon-Seng Gan, Santi Peksi, Chung Kwan Lai, Yen Theng Lee, Dongyuan Shi, Bhan Lam:

A Real-Time Platform for Portable and Scalable Active Noise Mitigation for Construction Machinery. 1-6 - Daisuke Minami, Kiyoshi Nishikawa:

YOLO for High Resolution Images without Retraining. 1-6 - Huiyong Bak, Changhyeon Jeong:

Effective Speech Data Augmentation Method To Improve Customer Service Representative Speech Recognition System Performance. 1-5 - Arth J. Shah, Hemant A. Patil:

Significance of Lower Frequency Regions for Audio Deepfake Detection. 1-6 - Sibusiso Reuben Bakana, Yongfei Zhang, Bhekisipho Twala:

WildPose: HRNet-based Lightweight and Efficient Wildlife Pose Estimation. 1-6 - Shogo Seki, Li Li:

Inference Efficient Source Separation Using Input-dependent Convolutions. 1-5 - Pham Minh Tuan, Mouloud Adel, Nguyen Linh Trung, Eric Guedj:

Does Brain Atlas Choice Matter? An Empirical Study in Alzheimer's Diagnosis Using FDG-PET Images. 1-6 - Fan Zhang, Jacob Benesty, Chao Pan, Jingdong Chen:

New Perspectives and Insights on Distortionless Microphone Array Beamforming. 1-5 - Kyungjune Lee, Mingyu Jang, Jungwoo Huh, Jeonghaeng Lee, Seokkeun Choi, Sanghoon Lee:
MYMV: A Music Video Generation System with User-preferred Interaction. 1-4 - Ruxin Zheng, Saeid Sanei:

Separation of Cardiopulmonary Sound Signals for Classification of Respiratory Diseases. 1-6 - Benjamin Yen, Kazuhiro Nakadai:

Drone audition: implementation of an indoor multi-drone system for sound source tracking. 1-6 - Chung-Wen Wu, Berlin Chen:

Layer-Wise Feature Distillation with Unsupervised Multi-Aspect Optimization for Improved Automatic Speech Assessment. 1-5 - Jingyuan Tang, Songlin Sun:

Forward Prediction-Guided Cross-Partition Targeted Pruning for VVenC. 1-6 - Pengyu Cheng, Zhenhua Ling, Meng Meng, Yujun Wang:

Disentangling Speaker Representations from Intuitive Prosodic Features for Speaker-Adaptative and Prosody-Controllable Speech Synthesis. 1-6 - Mingjun Zhang, Yan Feng, Yu Gao, Longting Xu:

Non-Target Conversion Based Speech Steganography for Secure Speech Communication System. 1-6 - Aoto Yasue, Benjamin Yen, Katsutoshi Itoyama, Kazuhiro Nakadai:

LCMV-based Scan-and-Sum Beamforming for Region Source Extraction. 1-6 - Rintaro Takata, Yoshikazu Washizawa:

Complex CNN incorporating Hilbert transform for steady-state visual evoked potential BCI. 1-6 - Van-De Nguyen, Minh-Huong Hoang Dang, Quang-Huy Nguyen, Manh Cuong Dinh, Thanh-Ha Do:
Enhancing Cell Segmentation using Deep Learning Models by Custom Processing Techniques. 1-5 - Changsheng Chen, Xijin Li:

Robust Image Watermarking Scheme under Halftone Distortion with Surrogate Model. 1-6 - Thi-Loan Pham, Gia-Minh Pham, Tien-Dat Nguyen, Van-Hung Le, Thi-Lan Le, Duy-Hai Vu, Hai Vu, Chi-Mai Pham, Thanh-Hai Tran:

Data Augmentation and Assessment for Enhanced Ovarian Tumor Classification. 1-6 - Haopeng Geng, Daisuke Saito, Nobuaki Minematsu:

A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker's Shadowings. 1-6 - Hengyi Zou, Sayaka Shiota:

Vocal Tract Length Perturbation-based Pseudo-Speaker Augmentation Considering Speaker Variability for Speaker Verification. 1-6 - Kangjian Huang, Yan Yang, Yongquan Jiang, Xiaobo Zhang, Zhuyi Angelina Li:

AFSDet: Video Small Object Detection Based on Adaptive Focused Slicing. 1-6 - Yizhou Peng, Eng Siong Chng:

Optimizing Multi-Speaker Speech Recognition with Online Decoding and Data Augmentation. 1-6 - Chihiro Watanabe, Hirokazu Kameoka:

GE2E-AC: Generalized End-to-End Loss Training for Accent Classification. 1-6 - Shingo Takemoto, Shunsuke Ono:

Rotation Invariant Spatio-Spectral Total Variation for Hyperspectral Image Denoising. 1-6 - Weiyi Xia, Satoru Fujita:

Cuisine Image Synthesis with Improved Multiscale GANs Guided by CLIP. 1-6 - Nanako Imaichi, Toru Nakashika:

Gamma-VAE: Speech representation based on VAE assuming gamma distribution for both latent variables and observation. 1-6 - Naijian Cao, Renjie He, Yuchao Dai, Mingyi He:

LoFLAT: Local Feature Matching using Focused Linear Attention Transformer. 1-6 - Hiroto Horimoto, Ryusei Kimura, Takahiro Tanaka, Shogo Okada:

Psychological Driving Style Estimation from GPS Sensor Data Alone. 1-6 - Seyun Um, Miseul Kim, Doyeon Kim, Hong-Goo Kang:

Bluemarble: Bridging Latent Uncertainty in Articulatory-to-Speech Synthesis with a Learned Codebook. 1-6 - Mingzhou He, Haojie Wang, Shuchang Zhou, Qingbo Wu, King Ngi Ngan, Fanman Meng, Hongliang Li:
Inertial Strengthened CLIP model for Zero-shot Multimodal Egocentric Activity Recognition. 1-6 - Guangwei Zhang, Yongping Xiong, Ruifan Li:

A Noisy Context Optimization Approach for Chinese Spelling Correction. 1-6 - Toru Takahashi, Eita Morigaki, Masato Nakayama:

Impulse response transforming method to control distance perception based on direct-to-reverberant energy ratio. 1-6 - Kenta Iwai, Takanobu Nishiura:

Performance Evaluation of Acoustic Echo and Noise Canceller with Variable-Step-Size Shared-Error NLMS Algorithm under Double-Talk Conditions. 1-5 - Longting Xu, Mingjun Zhang, Wenbin Zhang, Tianyi Wang, Jiawei Yin, Yu Gao:

Personal Voice Activity Detection With Ultra-Short Reference Speech. 1-6 - Naoki Koga, Yoshiaki Bando, Keisuke Imoto:

LEAD Dataset: How Can Labels for Sound Event Detection Vary Depending on Annotators? 1-6 - Yuhang Yang, Yizhou Peng, Hao Huang, Eng Siong Chng, Xionghu Zhong:

Adapting OpenAI's Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets. 1-6 - Sota Hirata, Norihiro Takamune, Kouei Yamaoka, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo:

Auxiliary-Function-Based Steering Vector Estimation Method for Spatially Regularized Independent Low-Rank Matrix Analysis. 1-6 - Siddharth Harsh Y. Malhotra, Sapan H. Mankad:
Audio Similarity Detection. 1-6 - So-Yeon Jang, Jong-Ok Kim:

Color Guided Disease Segmentation for Plant Images. 1-6 - Xuping Huang, Akinori Ito:

A Study on Variable Embedding Locations of Reversible Spectral Speech Watermarking. 1-6 - Kai-Wei Huang, Chia-Ping Chen:

Long Audio File Speaker Diarization with Feasible End-to-End Models. 1-6 - Hong-Jie Hu, Yu-Chiao Lai, Chia-Ping Chen:

Enhancing Branchformer with Dynamic Branch Merging Module for Code-Switching Speech Recognition. 1-6 - Satoru Fujita, Keizo Oyama:

Learning a Sequence of Cursive-Style Japanese Characters in Classical Literary Works. 1-6 - Yike Chen, Yuru Song, Peijia Zheng, Yusong Du, Weiqi Luo:
Privacy-Preserving Anomaly Detection in Bitstream Video based on Gaussian Mixture Model. 1-6 - Tsugumasa Yutani, Yuya Yamamoto, Shuyo Nakatani, Hiroko Terasawa:

Wavetable Synthesis Using CVAE for Timbre Control Based on Semantic Label. 1-6 - Zuhai Zhang, Luheng Jia, Li Song, Shuyuan Zhu, Yuanfang Guo, Kebin Jia:

Dictionary Learning Based Two-stage Near-lossless Video Compression. 1-6 - Shogo Mito, Miho Miyajima, Hirofumi Tomioka, Hitomi Sato, Takashi Takeuchi, Hitoshi Muto, Yuji Kabasawa, Hiroyuki Harada, Kana Eguchi, Shota Kato, Manabu Kano:
Postoperative Delirium Prediction Based on Preoperative Electrocardiogram and Electroencephalogram. 1-5 - Mengting Chen, Ziping Zhao:

Sparse Blind Deconvolution and Demixing via Block Majorization-Minimization. 1-6 - Xingyu Shen, Wei-Ping Zhu:
Multichannel Speech Enhancement Using Complex-Valued Graph Convolutional Networks and Triple-Path Attentive Recurrent Networks. 1-6 - Po-Cheng Chan, Chung-Li Lu, Jia-Ching Wang:

Detecting Abnormal Machine Sounds Using An Ensemble Approach with Data Augmentation Techniques. 1-4 - Rei Aso, Sayaka Shiota, Hitoshi Kiya:

Disposable-key-based image encryption for collaborative learning of Vision Transformer. 1-6 - Geeta Sai Sahasra, Kadwasra Swapna, Arushi Srivastava, Aditya Pusuluri, Hemant A. Patil:

Comparative Analysis of Glottal and Vocal Tract Features in Dysarthria. 1-6 - Keitaro Yamashita, Kazuki Naganuma, Shunsuke Ono:

Generalized Graph Signal Sampling under Subspace Priors by Difference-of-Convex Minimization. 1-6 - Yuanxi Lin, Yuriy Evgenyevich Gapanyuk:

Frequency & Channel Attention Network for Small Footprint Noisy Spoken Keyword Spotting. 1-6 - Jia-Liang Lu, Bi-Cheng Yan, Yi-Cheng Wang, Tien-Hong Lo, Hsin-Wei Wang, Li-Ting Pai, Berlin Chen:

EADSum: Element-Aware Distillation for Enhancing Low-Resource Abstractive Summarization. 1-6 - Tsubasa Yano, Benjamin Yen, Kazuhiro Nakadai:

Drone audition: dataset and methods for ground surface material classification using drone noise in outdoor environment. 1-6 - Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Real-Time Noise Estimation for Lombard-Effect Speech Synthesis in Human-Avatar Dialogue Systems. 1-6 - Sara Kashiwagi, Keitaro Tanaka, Shigeo Morishima:

Capturing Dynamic Identity Features for Speaker-Adaptive Visual Speech Recognition. 1-6 - Sheng Li, Yuka Ko, Akinori Ito:
LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM. 1-5 - Sifan Wu, Li Dong, Diqun Yan, Rangding Wang:

Normalizing Flows-Based Latent Variable Rearrangement for Generative Image Steganography. 1-6 - Beizuo Zhu, Kazunori Hayashi, Hiroki Mori:

Reduced-dimensional MUSIC Algorithm for Frequency Diverse Array in MIMO Radar System. 1-8 - Yen-Chou Pan, Yih-Liang Shen, Yuan-Fu Liao, Tai-Shih Chi:

Band-Split Inter-SubNet: Band-Split with Subband Interaction for Monaural Speech Enhancement. 1-6 - Akumalla Brahma Reddy, Bach-Tung Pham, Tung-Yu Zhuang, Bima Paristao, Pao-Chi Chang, Jia-Ching Wang:

Leveraging Attention Mechanisms for Breast Cancer Diagnosis. 1-4 - Zekun Yang, Jiajun He, Tomoki Toda:

Multi-Modal Video Summarization Based on Two-Stage Fusion of Audio, Visual, and Recognized Text Information. 1-6 - Hana Lebeta Goshu, Jun Xiao, Kin-Chung Chan, Cong Zhang, Mulugeta Tegegn Gemeda, Kin-Man Lam:

NeRF-FCM: Feature Calibration Mechanisms for NeRF-based 3D Object Detection. 1-6 - Jing Liang, Libo Wang, Peiya Li:
Fine-Grained Privacy-Preserving Image Retrieval in Cloud Environment. 1-6 - Hang Sheng, Qinji Shu, Hui Feng, Bo Hu:

Subset Random Sampling of Finite Time-vertex Graph Signals. 1-6 - Rohini Sri Mannepalli, Aditya Pusuluri, Hemant A. Patil:

Dysarthria Severity Classification Using Phase Based Features of LP Residual. 1-5 - Divesh Lala, Koji Inoue, Haruki Kawai, Zi Haur Pang, Mikey Elmers, Tatsuya Kawahara:
Development and evaluation of a semi-autonomous parallel attentive listening system. 1-6 - Michaël Antonie van Wyk, André Martin McDonald, David M. Rubin, Fangfang Zhang:

Novel Estimators for the Number of Susceptible Individuals in SIR Models of Infectious Epidemics. 1-6 - Fan Zhang, Chao Pan, Jingdong Chen, Jacob Benesty:

Low-Complexity Adaptive Beamformer for Joint Reverberation and Noise Suppression. 1-5 - Keiji Yamadera, Michiharu Niimi:

Improved Ultimate Link without Markers for Projective Transformation. 1-6 - Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee:

Empower Typed Descriptions by Large Language Models for Speech Emotion Recognition. 1-6 - Hau Joan, Yiqi Tew, Li Peng Tan:
Innovative Information Hiding in H.266/VVC using Sub-Block Transform Technique. 1-6 - Duhyun Kim, Jae-Young Sim:

Confidence-Aware Learning for Person Re-identification with Noisy Labels. 1-5 - Koichi Nishikawa, Shinsuke Ibi, Takumi Takahashi, Hisato Iwai:

Blind Self-Interference Analog Canceller with Differential Delay for Backscatter Communications. 1-6 - Trio Adiono, Erwin Setiawan, Michael Jonathan, Rahmat Mulyawan, Nana Sutisna, Infall Syafalni, Wasiu O. Popoola:

A Configurable OFDM Baseband Processor for RF-UOWC System-on-Chip. 1-4 - Justin Tomoya Wulf, Tetsuro Kitahara:

Analyzing House Music: Relations of Audio Features and Musical Structure. 1-5 - Jinzhuo Yao, Hongqing Liu, Yi Zhou, Lu Gan, Junkang Yang:
Diverse Time-Frequency Attention Neural Network for Acoustic Echo Cancellation. 1-6 - Quoc Anh Le, Hong-Thinh Nguyen:

New approach for Alzheimer's disease classification using topographic maps and deep learning model. 1-6 - Malik Akbar Hashemi Rafsanjani, Candy Olivia Mawalim, Dessi Puji Lestari, Sakriani Sakti, Masashi Unoki:

Unsupervised Anomalous Sound Detection Using Timbral and Human Voice Disorder-Related Acoustic Features. 1-6 - Ryosuke Onizawa, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:
Physics-Informed Neural Networks for Estimation of Scattered Sound Fields with Boundary Condition. 1-5 - Nao Harada, Rinka Kawano, Masaki Kawamura:
Proposal of Blind Extractable Additive Video Watermarking Method. 1-6 - Daimin Shi, Xiaoyong Lu, Yang Liu, Jingyi Yuan, Tao Pan:

Speech Depression Recognition from the Selfreference Effect Using LSTM with ResNet. 1-5 - Hung-Phong Tran, Thi-Hoai Phan, Thuy-Binh Nguyen, Thi-Ngoc-Diep Do, Hong-Quan Nguyen, Thanh-Hai Tran, Hien-Thanh Duong, Thi-Lan Le:

M-IRRA: A multilingual model for Text-based Person Search. 1-6 - Ryota Imanaka, Yuting Geng, Masato Nakayama, Takanobu Nishiura:

Augmented sound-image perception using pre-virtual-leading ultrasounds based on precedence effect. 1-6 - Tsung-Shan Yang, Yun-Cheng Wang, Chengwei Wei, Suya You, C.-C. Jay Kuo:

GMA: Green Multi-Modal Alignment for Image-Text Retrieval. 1-6 - Junda Zhu, Shisheng Guo, Longzhen Tang, Guolong Cui:

Multi-Channel Fusion Human Activity Recognition Algorithm Based on Millimeter-Wave Radar. 1-6 - Kwok Chin Yuen, Sheng Li, Jia Qi Yip, Engsiong Chng:

Low-resource Language Adaptation with Ensemble of PEFT Approaches. 1-6 - Hiromi Shidara, Kanta Miura, Takuro Ishii, Koichi Ito, Takafumi Aoki, Yoshifumi Saijo, Jun Ohmiya:

Performance Improvement of Single Plane-Wave Imaging Using U-Net and Discrete Wavelet Transform. 1-6 - Hualin Ren, Christian H. Ritz, Jiahong Zhao, Xiguang Zheng, Daeyoung Jang:
Generating Room Impulse Responses Using Neural Networks Trained with Weighted Combinations of Acoustic Parameter Loss Functions. 1-6 - Liwen Tang, Dingchang Zheng, Fei Chen:
Iterative Demographic Attentional Feature Fusion-based CNN and Transformer Network for Accurate Cuffless Blood Pressure Estimation. 1-5 - Yuxin Wang, Shuolin Yang, Qianxi Wu, Zhishuo Zhang, Yunxia Liu, Yang Yang, Yakui Dong, Cheng Fei, Junliang Liu, Lili Wang, Shuzhen Fan, Yongfu Li:

A Semi-supervised Low-Light Image Enhancement with Color Guidance. 1-6 - Xiangjie Sui, Shiqi Wang, Yuming Fang:
A Survey on Objective Quality Assessment of Omnidirectional Images. 1-6 - Xingfeng Li, Xiaohan Shi, Yuke Si, Zilong Zhang, Feifei Cui, Yongwei Li, Yang Liu, Masashi Unoki, Masato Akagi:
BEES: A New Acoustic Task for Blended Emotion Estimation in Speech. 1-6 - Jiawei Yin, Wenbin Zhang, Mingjun Zhang, Yu Gao:

Self-Supervised Augmented Diffusion Model for Anomalous Sound Detection. 1-5 - Koki Aoyama, Koichi Adachi:

Collection of Correlated Information from Superimposed Multiple Chirp Signals. 1-6 - Hualin Ren, Christian H. Ritz, Jiahong Zhao, Xiguang Zheng, Daeyoung Jang:
Towards a B-format Ambisonic Room Impulse Response Generator Using Conditional Generative Adversarial Network. 1-6 - Shao-Yun Luo, Kuei-Chen Chen, Jian-Jiun Ding, Cheng-Che Lee, Hsin-Jung Lee:

High and Low Frequency Region Separation Method for Adaptive Image Expansion. 1-6 - Yujin Han, Taewan Kim:

New Abnormal Behavior Detection for Patient Surveillance System. 1-5 - Jae Hoon Shim, Min Woo Kim, Tae Gyu Lim, Byungseok Min, Sang Hwa Lee, Nam Ik Cho:

Enhancing Semiconductor X-RAY Images: A Framework Combining Denoising and Super-Resolution Modules With a Novel Dataset. 1-6 - Kyoka Kazama, Taishi Nakashima, Nobutaka Ono:

Measurement of Relative Transfer Function for Own Voice in Head-Mounted Microphone Array. 1-5 - Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura:

Sound Quality Improvement in Visual Microphone by Emphasizing Focused Area Based on Focal Rate. 1-6 - Aulia Adila, Candy Olivia Mawalim, Masashi Unoki:

Detecting Spoof Voices in Asian Non-Native Speech: An Indonesian and Thai Case Study. 1-6 - Junting Wang, Satoko Koganemaru, Atsushi Shima, Yedi Cao, Kana Hirakawa, Ken Iwagana, Atsushi Suehiro, Keiko Maekawa, Tatsuya Mima, Yumie Ono:

Effect of Phase-Locked Transcranial Alternating Current Stimulation on Vocal tremor. 1-6 - Kun-Lin Tsai, Chao-Ting Huang:

Optimizing Computational Efficiency: In-Memory Computing with Dynamic Switching. 1-6 - Kazuki Naganuma, Shunsuke Ono:

Hyperspectral Unmixing With Row-Sparsity Enhancement: A Difference-of-Convex Approach. 1-5 - Ryusei Terui, Takeshi Yamada:

Speech emotion recognition based on crossmodal transformer and attention weight correction. 1-5 - Yuanyang Qi, Saeid Sanei:

Murmur Separation and Classification from Heart Sound Using Constrained Singular Spectrum Analysis and Wavelet Transform. 1-5 - Po Cheng Chan, Wei-Yu Chen, Chung Li Lu, Hsiang-Feng Chuang, Yu-Han Cheng, Jia-Ching Wang:

Integrating VGGSK and BEATs for Enhanced Sound Event Detection: A Semi-Supervised GRU-Based System with Weak Labels and Synthetic Soundscapes. 1-5 - James Gong, Bruce Li, Waleed Abdulla:

Optimising Neural Networks with Fine-Grained Forward-Forward Algorithm: A Novel Backpropagation-Free Training Algorithm. 1-6 - Li Du, Chao Pan, Lijun Zhang:

Wind Noise Reduction with Orthogonal Polynomial Expansion. 1-5 - Lo-Ya Li, Tien-Hong Lo, Jeih-Weih Hung, Shih-Chieh Huang, Berlin Chen:

Few-Shot Open-Set Keyword Spotting with Multi-Stage Training. 1-5 - Kenta Takahashi, Wataru Nakamura:

A Quasilinear-Time CVP Algorithm for Triangular Lattice Based Fuzzy Extractors and Fuzzy Signatures. 1-4 - Ji Qi, Huisheng Wang, H. Vicky Zhao:
ViP-CBM: Reducing Parameters in Concept Bottleneck Models by Visual-Projected Embeddings. 1-6 - Takatoshi Obata, Osamu Takyu, Kei Inage, Takeo Fujii, Kohei Yoshida, Masayuki Ariyoshi:

Observation of the Terrestrial Radio Environment Using the Low Earth Orbit Satellite Constellation. 1-5 - Huang-Cheng Chou:

A Tiny Whisper-SER: Unifying Automatic Speech Recognition and Multi-label Speech Emotion Recognition Tasks. 1-6 - Nimol Thuon, Jun Du:
KhmerFormer: Multi-Scale CNNs-Transformer with External Attention for Ancient Khmer Palm Leaf Isolated Glyph Classification. 1-6 - Zihang Lyu, Jun Xiao, Cong Zhang, Kin-Man Lam:

AI-generated image detectors are surprisingly easy to mislead... for now. 1-5 - Meet H. Soni, Ashish Panda, Sunil Kumar Kopparapu:

Generalized SpecAugment: Robust Online Augmentation Technique for End-to-End Automatic Speech Recognition. 1-5 - Arth J. Shah, Prathav Kevadiya, Hemant A. Patil:

Pop Noise Detection Using Group Delay Cepstral Coefficients. 1-6 - Chengzhe Shi, Wensheng Pan, Wanzhi Ma, Ying Liu, Qiang Xu, Zhiya Zhang, Shihai Shao:

A High-Isolation Sub-6 GHz In-Band Full-Duplex Communication System. 1-6 - Zhentao Lin, Zihao Chen, Bi Zeng, Leqi Chen, Jia Cai:

Performance Optimization in the Cascade of VAD and ASR Systems: A Study on Evaluation and Alignment Strategies. 1-6 - Liyuan Zhang, Xianrui Wang, Yichen Yang, Tetsuya Ueda, Shoji Makino, Jingdong Chen:

Heavy-tailed Distributions-Based Online Semi-blind Source Separation for Nonlinear Echo Cancellation. 1-5 - Yue-Yang He, Bi-Cheng Yan, Tien-Hong Lo, Meng-Shin Lin, Yung-Chang Hsu, Berlin Chen:

JAM: A Unified Neural Architecture for Joint Multi-granularity Pronunciation Assessment and Phone-level Mispronunciation Detection and Diagnosis Towards a Comprehensive CAPT System. 1-6 - Nutchanon Siripool, Suradej Duangpummet, Jessada Karnjana, Waree Kongprawechnon, Masashi Unoki:

Blind Estimation of Room Volume from Reverberant Speech Based on the Modulation Transfer Function. 1-6 - Zhe Xiao, Zongqi He, Zhuoning Xu, Yunze Li, Zelin Song, Calvin Leighton, Li Wang, Shanru Liu, Shiun Yee Wong, Wenfeng Huang, Wenjing Jia, Kin-Man Lam:
A Multi-Perceptual Learning Network for Retina OCT Image Denoising and Classification. 1-6 - Xue Yang, Changchun Bao, Xu Zhang, Xianhong Chen:

Target Speaker Extraction Method by Emphasizing the Active Speech with an Additional Enhancer. 1-6 - Yuto Ashikawa, Takashi Ito, Shohei Ishizu, Yosuke Kurihara:

A method for classification NEO-FFI answers fabricated and advantageous due to psychological bias using brainwave specific brain activity networks. 1-4 - Jintang Xue, Yun-Cheng Wang, Chengwei Wei, C.-C. Jay Kuo:

Efficient Feature Selection for Word Embedding Dimension Reduction. 1-6 - Felix Ming-Fei Duan, Wan-Chi Siu, Chun Chuen Hui:

New approach on Smiling faces with Domain Transfer in Latent Space. 1-5 - Xinyu Wang, Hong-Shuo Chen, Zhiruo Zhou, Suya You, Azad M. Madni, C.-C. Jay Kuo:

Green Video Camouflaged Object Detection. 1-6 - Junkang Yang, Hongqing Liu, Lu Gan, Yi Zhou, Xing Li, Jie Jia, Jinzhuo Yao:
SDNet: Noise-Robust Bandwidth Extension under Flexible Sampling Rates. 1-6 - Yuhang Zhang, Yuanman Li, Li Dong, Xia Li:

Robust Watermarking via Dual Guidance. 1-6 - Shu-Ping Chang, Cheng-Che Lee, Hsin-Jung Lee, Chieh-Hsiung Kuan, Jason Gemsun Young, Chia-Yu Yao, Jian-Jiun Ding:

An Annealing-Inspired Gradient-Descent Based Suboptimal Solver for Combinatorial Problems. 1-6 - Tomoaki Mizuno, Takuya Kishida, Natsue Yoshimura, Toru Nakashika:

An Investigation on the Speech Recovery from EEG Signals Using Transformer. 1-6 - Xiaoqing Tong, Kazunori Hayashi:

Deep Unfolding Aided Parameter Optimization for Multi-task Diffusion LMS Algorithm. 1-6 - Tomohiro Ariga, Reo Minakawa, Kazunori Kojima, Shi-wook Lee, Yoshiaki Itoh:

Keyword spotting for dialectal speech and Introduction of wav2vec2.0. 1-5 - Yoto Ikezaki, Yuting Geng, Masato Nakayama, Takanobu Nishiura:

Virtual multi-boosted amplitude modulation toward high-pressure audible sound with parametric array loudspeakers. 1-6 - Chenxing Li, Manjie Xu, Dong Yu:

SRC-gAudio: Sampling-Rate-Controlled Audio Generation. 1-6 - Teng-Kuan Huang, Mei-Chen Yeh:

Improving Semi-Supervised Object Detection by ROI-Enhanced Contrastive Learning. 1-6 - Seyun Um, Yongju Lee, WooSeok Ko, Yuan Zhou, Sangyoun Lee, Hong-Goo Kang:

EavaNet: Enhancing Emotional Facial Expressions in 3D Avatars through Speech-Driven Animation. 1-6 - Junwu Huang, Zhexiong Wan, Zhicheng Lu, Juanjuan Zhu, Mingyi He, Yuchao Dai:

Ev3DGS: Event Enhanced 3D Gaussian Splatting from Blurry Images. 1-6 - Yuru Song, Yike Chen, Peijia Zheng, Yusong Du, Weiqi Luo:
Secure Moving Object Detection Transformer in Compressed Video with Feature Fusion. 1-6 - Rashed Iqbal, Christian H. Ritz, Jack Yang, Sarah K. Howard:
Few-Shot Audio Classification Model for Detecting Classroom Interactions Using LaSO Features in Prototypical Networks. 1-6 - Mary Josy John, Imad Barhumi:

Handling Missing Data in Limited-View Photoacoustic Tomography Using Compressive Sensing Algorithm-Based Deep Learning. 1-6 - Jonghwan Na, Yeseul Park, Bowon Lee:

A Comparative Study on the Biases of Age, Gender, Dialects, and L2 speakers of Automatic Speech Recognition for Korean Language. 1-6 - Jinyi Mi, Xiaohan Shi, Ding Ma, Jiajun He, Takuya Fujimura, Tomoki Toda:

Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions. 1-6 - Ken Kalang Al Qalyubi, Nur Ahmadi, Dessi Puji Lestari:

Comparative Evaluation of Fine-Tuned Hybrid Transformer and Band-Split Recurrent Neural Networks for Music Source Separation. 1-5 - Wing-Ho Cheng, Wan-Chi Siu, H. Anthony Chan:

High-Quality Facial Pose Generation with Latent Space Processing. 1-7 - Haonan Hu, Ziye Yang, Jie Chen, Lijun Zhang:

Speech Dereverberation with Deconvolution Regularized by Denoising. 1-6 - Mas Ira Syafila Mohd Hilmi Tan, Lai-Kuan Wong, Yuen Peng Loh, Chih-Yang Pee:

Enhancing Early Plant Disease Detection: 1D to 2D Spectral Transformations. 1-6 - Tetsuya Asakawa, Masashi Hashimoto, Takeshi Miyaji, Kazuki Shimizu, Kei Nomura, Masaki Aono:

Real-time Segmentation of Coronary Artery Calcification Using Spatial Attention and Parallel Convolution. 1-5 - Kento Masuda, Kazumasa Yamamoto, Seiichi Nakagawa:
Data Augmentation Methods and Influence of Speech Recognition Performance for TED Talk's English to Japanese Speech Translation. 1-6 - Nana Sutisna, Aditya Prawira Nugroho, Christopher Jeffrey, Patrick Amadeus Irawan, Rizky Ramadhana, Ronggur Mahendra, Michael Jonathan, Infall Syafalni, Trio Adiono:

Leveraging IoT and Machine Learning for Efficient Rice Stock Monitoring and Prediction. 1-6 - Trong-Duc Nguyen, Tien-Dung Do, Thanh-Ha Do:

Automated Pseudo-Label Generation and Parallel Computing for Enhanced Few-Shot Medical Image Segmentation. 1-6 - Menghan Li, Zhihua Huang:

WavLM and Omni-Scale CNNs: Enhancing Boundary Detection in Partially Spoofed Audio. 1-5 - Qing Feng, Zhiqiang Wu, Xuebin Li, Heping Shen, Liushang, Tangmin, Shengquan Feng:

Temporal-Spatial Correlation Analysis for Ship-Radiated Noise Based on Random Matrix Theory. 1-6 - Jen-Tzung Chien, Yi-Chien Wu:
Empathetic Response Generation via Regularized Q-Learning. 1-6 - Shuting Hao, Daisuke Saito, Nobuaki Minematsu:

Enhancing Acoustic Scene Classification with Layer-wise Fine-Tuning on the SSAST Model. 1-6 - Tianwei Zhang, Lianru Gao, Xu Sun, Lina Zhuang:
Tiny Object Detection Enhancement for Large-Scale Remote Sensing Imagery. 1-5 - Xinqi Jiang, Jinyu Tian:

Source Attribution for Images Generated by Diffusion-Based Text-to-Image Models: Exploring the Forensics Approach. 1-6 - Jia Qi Yip, Kwok Chin Yuen, Bin Ma, Engsiong Chng:

Speech Separation using Neural Audio Codecs with Embedding Loss. 1-6 - Koki Horikoshi, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:
Pressure Matching Using Data-Driven Estimation for Sound Fields and Transfer Functions. 1-5 - Siyang Qi, Fei Wang, Hongzhi Sun, Yang Ge, Bo Xiao:

GVDIE: A Zero-Shot Generative Information Extraction Method for Visual Documents Based on Large Language Models. 1-6 - Kohei Hayashi, Soichiro Honda, Hirokazu Kamei, Yoshihiro Maeda, Norishige Fukushima:

Contrast-Aware DCT for Image Enhancement with JPEG Compatible Coding. 1-6 - Tatsuro Inaba, Kazuyoshi Yoshii, Eita Nakamura:

On the Importance of Time and Pitch Relativity for Transformer-Based Symbolic Music Generation. 1-6 - Kazuki Yamato, Satoshi Ito:

Sampling Pattern Augmentation to Enhance Deep Learning-based Image Reconstruction of MRI. 1-6 - Kuan-Hsun Ho, En-Lun Yu, Jeih-Weih Hung, Shih-Chieh Huang, Berlin Chen:

GLASS: Investigating Global and Local context Awareness in Speech Separation. 1-6 - Hiroto Sawada, Shoko Imaizumi, Hitoshi Kiya:

Enhancing Security Using Random Binary Weights in Privacy-Preserving Federated Learning. 1-6 - Aquib Iqbal, Abid Hasan Zim, Md Asaduzzaman Tonmoy, Limengnan Zhou, Asad Malik, Minoru Kuribayashi:

EAViT: External Attention Vision Transformer for Audio Classification. 1-6 - Bach-Tung Pham, Pao-Chi Chang, Jia-Ching Wang:

Seismic-ionospheric Precursor Prediction Using Deep Learning. 1-4 - Eun-bin An, Ayoung Kim, Soon-Heung Jung, Hyon-Gon Choo, Kwang-deok Seo:

Adaptive Spatial Re-sampling Method for Video Coding for Machines. 1-4 - Wenze Ren, Yi-Cheng Lin, Huang-Cheng Chou, Haibin Wu, Yi-Chiao Wu, Chi-Chun Lee, Hung-Yi Lee, Hsin-Min Wang, Yu Tsao:

EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations. 1-6 - Divesh Lala, Koji Inoue, Tatsuya Kawahara:

Prediction of negative user reactions towards system responses during attentive listening. 1-6 - Daishi Tanaka, Michiharu Niimi:

Detection of Diffusion-Generated Images Using Sparse Coding. 1-6 - Takumi Nagawaki, Keisuke Ikeda, Kohei Chike, Hiroyuki Nagano, Masaki Nose, Satoshi Tamura:

Targeted Representation with Information Disentanglement Encoding Networks in Tasks. 1-5 - Yoto Fujita, Aditya Arie Nugraha, Diego Di Carlo, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii:

Run-Time Adaptation of Neural Beamforming for Robust Speech Dereverberation and Denoising. 1-6 - Toshihiro Tsukagoshi, Kazuhiro Koiwai, Masafumi Nishida, Masafumi Nishimura:

SSL-based Chewing and Swallowing Detection Using Multiple Skin-contact Microphones. 1-5 - Rinka Kawano, Masaki Kawamura:
Estimation of rotation angle and anisotropic scaling rate using pilot signals for watermarking. 1-6 - Juhwan Yoon, Hyungseob Lim, Hyeonjin Cha, Hong-Goo Kang:

StylebookTTS: Zero-Shot Text-to-Speech Leveraging Unsupervised Style Representation. 1-6 - Wageesha Manamperi, Thushara D. Abhayapala:

Successive Speaker Relative Transfer Function Estimation Through Relative Transfer Matrix in Noisy Reverberant Environments. 1-6 - Keigo Ichikawa, Sei Ueno, Akinobu Lee:

Data generation for speaker diarization by speaker transition information. 1-5 - Takehiro Imamura, Yuka Hashizume, Tomoki Toda:

Multi-Task Learning Approaches for Music Similarity Representation Learning Based on Individual Instrument Sounds. 1-6 - Kazuhiro Nakadai, Makoto Kumon, Yoko Sasaki, Kotaro Hoshiba, Benjamin Yen:

Swarm Active Audition System with Robots and Drones for a Search and Rescue Task. 1-6 - Infall Syafalni, Angelica Winasta Sinisuka, Dwi Kalam Amal Tauhid, Farrel Ahmad, Muhammad Alif Putra Yasa, Steven Alexander Wen, Erwin Setiawan, Nana Sutisna, Trio Adiono:

Exploration Robot Based On YOLOv8 Algorithm. 1-5 - Kai Guo, Xiang Xie, Fengrun Zhang:

Annotation-free Fine-tuning for Unsupervised Anomalous Sound Detection. 1-5 - Fauzan Maftuh Alwafi, Boby Mugi Pratama, Phuong Thi Le, Bima Prihasto, Jia-Ching Wang:
Enhanced Detection of Illegally Parked Vehicles Using YOLO and Good Feature to Track Methods. 1-6 - Mai Ohta, Hiroki Matsuura, Takeo Fujii:

A Study on Packet-Level Index Modulation Using Frequency Offsets within a LoRaWAN Channel. 1-6 - Yuki Sato, Yuya Chiba, Ryuichiro Higashinaka:

Investigating the Language Independence of Voice Activity Projection Models through Standardization of Speech Segmentation Labels. 1-6 - Ginji Ohashi, Shinsuke Ibi, Takumi Takahashi, Hisato Iwai:

Data-Driven Tuning for Weighted Least Squares of BLE-AoA-based Indoor Localization. 1-6 - Vu Hoang Dung, Nguyen Trung Kien, Do Thanh Ha:

Enhanced Sparse Convolutional Detection Model for 3D Object Detection in Autonomous Vehicles Adapted to Traffic Conditions in Vietnam. 1-6 - Trio Adiono, Clarence Amadeus, Sindy Novaria Cicilya Sinaga, Teuku Rafifsyah Thomi:

Implementation of Real-Time Oscillometric Based Algorithm for Blood Pressure Measurement in Patient Monitor. 1-6 - Trio Adiono, Rd Elviana La'salina Muhlis, Clarence Amadeus, Sindy Novaria Cicilya Sinaga:

Development of Simple Algorithm to Detect and Filter Motion Artifact Noise in Non Invasive Blood Pressure (NIBP) Measurement. 1-6 - Ryotaro Nagase, Takashi Sumiyoshi, Natsuo Yamashita, Kota Dohi, Yohei Kawaguchi:

Can We Estimate Purchase Intention Based on Zero-shot Speech Emotion Recognition? 1-6 - Aditya Raikar, Meet H. Soni, Ashish Panda, Sunil Kumar Kopparapu:

Acoustic model adaptation in noisy and reverberated scenarios using multi-task learned embeddings. 1-5 - Kanishq Singhal, Aditya Goyal, Priyanka Gupta:

Quefrency Approach to Audio Deepfake Detection. 1-6 - Zhen-Xun Lee, Jian-Jiun Ding:

PBJDT: Point-Based Joint Detection-and-Tracking. 1-6 - Hangjing Zhang, H. Vicky Zhao:

Modeling and Analysis of the Interaction between Opinions and Actions among Heterogeneous Agents. 1-6 - Hoang-Son Bui, Sy-Hoang Tran, Thuy-Binh Nguyen, Thanh-Hai Tran, Hai Vu, Thi-Lan Le:

Marker-Aware Ovarian Tumor Segmentation from Ultrasound Images. 1-6 - Yuki Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura:

Deep-Learning-Based Speech Enhancement with Rough-Focused Optical Laser Microphone by Reconstructing Complex Spectrum. 1-5 - Nischay Purnekar, Benedetta Tondi, Mauro Barni:

Physical Domain Adversarial Attacks Against Source Printer Image Attribution. 1-6 - Shu Komatsu, Akira Kubota:

Color Enhancement for the Colorblind Using Color Correction Intensity Map and Pix2pix Image Conversion. 1-5 - Hiya Chaudhari, Arth J. Shah, Hemant A. Patil:

Cross Lingual Speech Representation for Infant Cry Classification. 1-5 - Yi Zhang, FangYuan Liu, JiaJia Song, Qi Zeng, Hui He:

MTFNet: Multi-Scale Transformer Framework for Robust Emotion Monitoring in Group Learning Settings. 1-8 - Zezhong Jin, Youzhi Tu, Man-Wai Mak:

Joseph: phonetic-aware speaker embedding for far-field speaker verification. 1-6 - Naotaka Kawata, Shota Orihashi, Satoshi Suzuki, Tomohiro Tanaka, Mana Ihori, Naoki Makishima, Taiga Yamane, Ryo Masumura:

Block Refinement Learning for Improving Early Exit in Autoregressive ASR. 1-6 - Gaurav Hirani, Kevin I-Kai Wang, Waleed H. Abdulla:

Continual Learning with Self-Organizing Maps: A Novel Group-Based Unsupervised Sequential Training Approach. 1-6 - Ryuichi Hatakeyama, Kohei Okuda, Toru Nakashika:

DDPMVC: Non-parallel any-to-many voice conversion using diffusion encoder. 1-6 - Tianqin Zheng, Hanchen Pei, Ningning Pan, Jilu Jin, Gongping Huang, Jingdong Chen, Jacob Benesty:

A Single-Input/Binaural-Output Perceptual Rendering Based Speech Separation Method in Noisy Environments. 1-5 - Dipanita Chakraborty, Werapon Chiracharit, Kosin Chamnongthai, Minoru Okada:

Camera Focal Length Prediction for Neural Novel View Synthesis from Monocular Video. 1-5 - Jun-Seok Lee, Yun-Sung Lee, Han-Jeong Hwang:

Effect of Dynamic Binaural Beats on Concentration Enhancement. 1-4 - Koki Maruyama, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada:

Speech Synthesis from IPA Sequences through EMA Data. 1-6 - Koichi Ito, Taito Tonosaki, Takafumi Aoki, Tetsushi Ohki, Masakatsu Nishigaki:

Multibiometrics Using a Single Face Image. 1-6 - Takumi Yamamoto, Kotaro Hoshiba, Benjamin Yen, Kazuhiro Nakadai:

Implementation of a Robot Operation System-based network for sound source localization using multiple drones. 1-6 - Arth J. Shah, Nandini V. Mandaviya, Hemant A. Patil:

Voice Liveness Detection Using Linear Frequency Residual Cepstral Coefficients. 1-6 - A. Sumarudin, Nana Sutisna, Infall Syafalni, Bambang Riyanto Trilaksono, Trio Adiono:

Optimizing Deep Q-Network for Shortest Path Computation of Mobile Robot Agents. 1-6 - Takuto Wada, Ryohei Sasaki, Katsumi Konishi:

Adaptive Subspace Clustering for Matrix Completion. 1-5 - Keiko Kawase, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:

Data-Driven Sound Field Reproduction for Higher-Order Mode Matching Using a Circular Loudspeaker Array. 1-5 - Jianan Chen, Chenhui Chu, Sheng Li, Tatsuya Kawahara:

Data Selection using Spoken Language Identification for Low-Resource and Zero-Resource Speech Recognition. 1-6 - Beom Jun Woo, Ji Won Yoon, Min Hyun Han, Chanyeong Moon, Nam Soo Kim:

EEND-EM: End-to-End Neural Speaker Diarization with EM-Network. 1-5 - Minh Vu, Zhou Wei, Binit Bhattarai, Kah Kuan Teh, Tran Huy Dat:

VietSing: A High-quality Vietnamese Singing Voice Corpus. 1-6 - Ngoc Son Tran, Pei-Chin Hsieh, Yih-Liang Shen, Yen-Hsun Chu, Tai-Shih Chi:

Real-Time Monophonic Dual-Pitch Extraction Model. 1-6 - Wageesha N. Manamperi, Thushara D. Abhayapala:

Relative Transfer Matrix for Drone Audition Applications: Source Enhancement. 1-6 - Asfa Jamil, Alessandro Artusi:

Ablation Study to Derive a Computationally Efficient Deep Learning-Based Super-Resolution Approach. 1-6 - Yoshifumi Shoji, Masahiro Yukawa:

Robust Quantile Regression Under Unreliable Data. 1-6 - Masaya Togashi, Ingon Chanpornpakdi, Toshihisa Tanaka:

Electroencenphalogram-Based Effective Features for Sustained Attention Assessment in Conversation. 1-6 - Ryu Takeda, Kazunori Komatani:

Scale-invariant Online Voice Activity Detection under Various Environments. 1-6 - Ritik Mahyavanshi, C. V. Mahesh Reddy, Arth J. Shah, Hemant A. Patil:

Teager Energy Cepstral Coefficients for Audio Deepfake Detection. 1-6 - Haruna Aoki, Sinan Zhang, Yumie Ono:

EEG-based Evaluation of Enjoyment Emotion during cognitive-motor task. 1-4 - Xusheng Yang, Zifeng Zhao, Yuexian Zou:

Peer Learning via Shared Speech Representation Prediction for Target Speech Separation. 1-7 - So Watanabe, Chee Siang Leow, Junichi Hoshino, Takehito Utsuro, Hiromitsu Nishizaki:

Assessment and Improvement of Customer Service Speech with Multiple Large Language Models. 1-6 - Arth J. Shah, Savita H. Yadav, Hemant A. Patil:

Teager Energy Cepstral Coefficients for Spoken Language Identification. 1-6 - Saki Nomura, Junya Hara, Hiroshi Higashi, Yuichi Tanaka:

Dynamic Sensor Placement on Graphs Based on Graph Signal Sampling Theory. 1-6 - Masora Okano, Koichi Ito, Masakatsu Nishigaki, Tetsushi Ohki:

Enhancing Remote Adversarial Patch Attacks on Face Detectors with Tiling and Scaling. 1-6 - Haeyoung Lee, Sunhee Kim, Minhwa Chung:

Analysis of Various Self-Supervised Learning Models for Automatic Pronunciation Assessment. 1-6 - Atsuya Emoto, Ryo Matsuoka:

Hyperspectral Anomaly Detection Using Robust Principal Component Analysis with Autoencoding Adversarial Network. 1-4 - Yu-Hsien Chung, Chi-Hsuan Lu, Jung-Hui Cho, Chih-Chang Yu:

Utilizing Cross Layer Attentions for Semantic Segmentation of Small Objects. 1-6 - Jinyi Mi, Sehun Kim, Tomoki Toda:

Improved Architecture for High-resolution Piano Transcription to Efficiently Capture Acoustic Characteristics of Music Signals. 1-6 - Hao Qin, Haoran Sun, Yi Wang:

A Byte-based GPT-2 Model for Bit-flip JPEG Bitstream Restoration. 1-6 - Xiaohan Shi, Yuan Gao, Jiajun He, Jinyi Mi, Xingfeng Li, Tomoki Toda:

A Study on Multimodal Fusion and Layer Adapter in Emotion Recognition. 1-6 - Xuan-Phuoc Nguyen, Thi-Huong Nguyen, Duc-Tan Tran, Tien-Son Bui, Van-Toi Nguyen:

An isolated Vietnamese Sign Language Recognition method using a fusion of Heatmap and Depth information based on Convolutional Neural Networks. 1-6 - Hongil Kim, Changwoo Han, Donghyun Kim, Sung-Chang Lim, Seung-Won Jung:

Test-Time Optimization for Post-Processing of Compressed Videos. 1-6 - Yanjun Li, Xiangyu Zhao, Zhengpeng Zha, Zhenhua Ling:

ET-SSM: Linear-Time Encrypted Traffic Classification Method Based On Structured State Space Model. 1-6 - Dongfei Chang, Jijie Wu, Xiaoxu Li:

Agent Attention Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification. 1-6 - Yuan-Jhe Yin, Bo-Yu Chen, Berlin Chen:

A Novel LLM-based Two-stage Summarization Approach for Long Dialogues. 1-6 - Jin Xuan Teh, Norihiro Takamune, Hiroshi Saruwatari, Benjamin Yen, Michael Kingan, Yusuke Hioka:

Beamforming informed independent low-rank matrix analysis for sound source enhancement in unmanned aerial vehicles. 1-6 - Li-Ting Pai, Yi-Cheng Wang, Bi-Cheng Yan, Hsin-Wei Wang, Jia-Liang Lu, Chi-Han Lin, Juan-Wei Xu, Berlin Chen:

An Effective Contextualized Automatic Speech Recognition Approach Leveraging Self-Supervised Phoneme Features. 1-6 - Ximin Chen, Yuting Ding, Nan Yan, Changsheng Chen, Fei Chen:

Context-FFT: A Context Feed Forward Transformer Network for EEG-based Speech Envelope Decoding. 1-5 - Jiajin He, Chengxi Dong, Yunqi Cai, Dong Wang:

ComplexFace: A Public Visible-Thermal Face Dataset with Real-Life Complexity. 1-6 - Danqi Jin, Yitong Chen, Jie Chen, Gongping Huang:

Affine Combination of General Adaptive Filters. 1-5 - Takaaki Kojima, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari:

Design of Spectrogram-Consistency Regularization Term Dependent on Observation in Independent Low-Rank Matrix Analysis for Blind Source Separation. 1-6 - Jia-Yin Peng, Jian-Yi Chen, Bing-Zhao Li:

A Novel kind of WVD Associated with the Linear Canonical Transform. 1-6 - Haokun Cao, Yuanman Li, Xinyu Yang, Xia Li:

Region Aware Framework for Constrained Image Splicing Detection and Localization. 1-6 - Muhammad Sayyid Afif, Infall Syafalni, Nana Sutisna, Trio Adiono:

Transformer Attention Matrix Multiplication Design using 4 × 4 Systolic Arrays. 1-6 - Kapeleshh KS, Wei Chen, Prince Aldrin Domer, Hong Ji:

Exploring Brain Connectivity Patterns and Cognitive Resilience in Aging: A Study with the LEMON Dataset. 1-6 - Tsutahiro Fukuhara, Junya Hara, Hiroshi Higashi, Yuichi Tanaka:

Graph Filter Transfer for Time-Varying Signal Estimation Between Two Networks. 1-6 - Xianrui Wang, Shiqi Zhang, Bo He, Shoji Makino, Jingdong Chen:

Learnable Cross-Correlation based Filter-and-Sum Networks for Multi-channel Speech Separation. 1-5 - Sadahiro Yoshikawa, Ryo Ishii, Shogo Okada:

Is Corpus Suitable for Human Perception?: Quality Assessment of Voice Response Timing in Conversational Corpus through Timing Replacement. 1-6 - Seunghee Han, Sunhee Kim, Minhwa Chung:

Developing a Multilingual Spontaneous L2 Speech Corpus for Automated Proficiency Assessment. 1-6 - Daiki Sawada, Masahiro Yukawa:

Robust Adaptive Filtering Based on Adaptive Projected Subgradient Method: Moreau Enhancement of Distance Function. 1-6 - Jingyu Ren, Lei Yang:

Enhanced RefineDNet for Single Image Dehazing. 1-6 - Veron Zhen Liang Hii, Aaron Ken Kiat Lo, Ida Pei Xin Lee, Alec Vince Gonzales Abuan, Sue Han Lee, Patrick Hang Hui Then:

Two-Way Malaysian Sign Language Communication System for Inclusive Education. 1-6 - Doyeon Kim, Yanjue Song, Nilesh Madhu, Hong-Goo Kang:

Enhancing Neural Speech Embeddings for Generative Speech Models. 1-6 - Tsubasa Naito, Ryuto Ito, Yuichi Tanaka, Shogo Muramatsu:

Dictionary Learning for Directed Graph Signals via Augmented GFT. 1-6 - Masaki Aono, Tetsuya Asakawa, Kazuki Shimizu, Masashi Hashimoto, Takeshi Miyaji, Kei Nomura:

Detecting Coronary Artery Stenosis from Cardiac CT Images using 3D CNNs. 1-6 - Cong Hieu Le, Lam Thai Nguyen, Trung Kien Pham, Le Khanh Nguyen, Tran Hiep Dinh, Stefan Jouannic, Helene Adam, Pierre Duhamel, Nguyen Linh Trung, Trong-Minh Hoang:

Structural Analysis of Asian and African Rice Panicles via Transfer Learning. 1-8 - Wu-Hao Li, Te-Hsin Liu, Chen-Yu Chiang:

A Preliminary Study on Analysing Mandarin Tone Values of Romance L2 Mandarin Learners. 1-6 - En-Lun Yu, Ruei-Xian Chang, Jeih-Weih Hung, Shih-Chieh Huang, Berlin Chen:

COIN-AT-PVAD: A Conditional Intermediate Attention PVAD. 1-5 - Jen-Tzung Chien, Wei-Yu Sun:

Adversarial Augmentation and Adaptation for Speech Recognition. 1-6 - Zongmei Chen, Xin Liao, Xiaoshuai Wu, Yanxiang Chen:

Compressed Deepfake Video Detection Based on 3D Spatiotemporal Trajectories. 1-8 - Rei Hamakawa, Michiharu Niimi:

Generation of target speech with speaker individuality based on accent conversion for English pronunciation learning. 1-6 - Koji Iwano, Wakana Komuro, Manami Gomi:

Comparative Analysis of Voice Mimicry Attacks by High- and Low-Skilled Imitators on Speaker Verification Systems. 1-6 - Duc Hai Nguyen, Trong Hiep Do, Hoang Linh Phuong Nguyen, Quoc Khanh Nguyen, Duc-Tan Tran, Tien-Son Bui, Van Toi Nguyen:

A Solution For Anomaly Detection of Red Beans In A Product Processing Line. 1-5 - Sayaka Toma, Tomoki Ariga, Yosuke Higuchi, Ichiju Hayasaka, Rie Shigyo, Tetsuji Ogawa:

Differences Between Singer and Speaker Verification: Training Singer Feature Representation Extractor Utilizing Singing Voice Characteristics. 1-5 - Yuma Kinoshita, Hitoshi Kiya:

Scene-Segmentation-Based Exposure Compensation for Tone Mapping of High Dynamic Range Scenes. 1-6 - Shaoxiang Dang, Tetsuya Matsumoto, Yoshinori Takeuchi, Hiroaki Kudo:

U-Mamba-Net: A highly efficient Mamba-based U-net style network for noisy and reverberant speech separation. 1-5 - Huisheng Wang, Mingxiao Liu, Ji Qi, H. Vicky Zhao:

Optimal Investment With Incomplete Information and Herd Effect. 1-6 - Ravindrakumar M. Purohit, Dharmendra H. Vaghera, Hemant A. Patil:

GPGAN-VC: Enhancing Voice Conversion using Gradient Penalty. 1-6 - Dahyun Kim, Dongkwon Jin, Chang-Su Kim:

Monocular Depth Estimation for Autonomous Driving Based on Instance Clustering Guidance. 1-6 - Kaibao Nie:

Incorporating Auditory Processing into Undergraduate Signal Processing Courses to Enhance Student Learning. 1-5 - Chuong Hoang Vo, Truong Thanh Nhat Mai, Chul Lee:

Cloud Removal in Hyperspectral Satellite Images Using Low-rank Tensor Completion. 1-6 - Yicheng Li, Xinghua Sun:

One-step Spectral Estimation for Euclidean Distance Matrix Approximation. 1-6 - Zhanxuan Mei, Yun-Cheng Wang, C.-C. Jay Kuo:

GSBIQA: Green Saliency-guided Blind Image Quality Assessment Method. 1-6 - Anindhita Nayazirly Sukarno, Yahwista Salomo, Trio Adiono, Infall Syafalni, Nana Sutisna, Rahmat Mulyawan:

Accelerated Real-Time Local Maxima Detection in Video Streams Using FPGA Technology. 1-6 - Mare Hirose, Shoko Imaizumi, Hitoshi Kiya:

On the Security of Bitstream-level JPEG Encryption with Restart Markers. 1-6 - Conghui Li, Chern Hong Lim, Xin Wang:

A Parameter-free model for long-term concrete creep prediction. 1-6 - Ravindra M. Purohit, Arth J. Shah, Hemant A. Patil:

GGMDDC: An Audio Deepfake Detection Multilingual Dataset. 1-6 - Davy Tec-Hinh Lau, Jian-Jiun Ding, Guillaume Muller:

Optimization of the Intensity Aware Loss for Dynamic Facial Expression Recognition. 1-5 - Sarah Shamina Abdul Rauf, Mas Ira Syafila Mohd Hilmi Tan, Yuen Peng Loh:

Multi-band Satellite Image Analysis for Multi-label Classification. 1-6 - Shintami Chusnul Hidayati, Muhammad Valda Rizky Nur Firdaus, Riki Wahyu Nur Dianto, Sarwosri:

Unleashing Attributes-content Adaptation with Multi-color Spaces for Food Photo Aesthetic Assessment. 1-6 - Ravindrakumar M. Purohit, Dharmendra H. Vaghera, Arth J. Shah, Hemant A. Patil:

PPHiFi-TTS: Phonetic Preserved High-Fidelity Text-to-Speech for Long-Term Speech Dependencies. 1-6 - Yiting Zhang, Kaien Mo, Tetsuya Ueda, Yichen Yang, Shoji Makino:

On Joint Dereverberation and Single Moving Source Separation with Online Source Steering. 1-4 - Mei Hashimoto, Michiharu Niimi:

Generation of Photo Slideshow with Song based on Closeness between Concept of Lyrics and That of Images. 1-6 - Meghana Avula, Aditya Pusuluri, Hemant A. Patil:

Significance of Entropy Based Features For Dysarthric Severity Level Classification. 1-6 - Rui Zhou, Akinori Ito, Takashi Nose:

Improving Speaker Consistency in Speech-to-Speech Translation Using Speaker Retention Unit-to-Mel Techniques. 1-6 - Meng-Shin Lin, Bi-Cheng Yan, Tien-Hong Lo, Hsin-Wei Wang, Yue-Yang He, Wei-Cheng Chao, Berlin Chen:

PG-MDD: Prompt-Guided Mispronunciation Detection and Diagnosis Leveraging Articulatory Features. 1-6 - Zhi-Wei Tan, Andy W. H. Khong:

SMoLnet-T: An Efficient Complex-spectral Mapping Speech Enhancement Approach with Frame-wise CNN and Spectral Combination Transformer for Drone Audition. 1-6 - Jing-Ming Guo, Lun-Da Yuan, Cian Huang, Yi-Chong Zeng:

Contrastive Learning Based Knowledge Distillation for Enhancing Defect Detection. 1-6 - Wataru Hatakeyama, Shinnosuke Nozaki, Ayumi Serizawa, Mizuho Yoshihira, Masahiro Fujita, Ayako Yoshimura, Tetsushi Ohki, Masakatsu Nishigaki:

Multi-Observed Authentication: A secure and usable authentication based on multi-point observation of a single physical credential. 1-6 - Umi Syamimi, Chern Hong Lim, Lillian Yee Kiaw Wang:

IoT-based Smart Attendance System using Face Recognition and Motion Detection. 1-6 - Ming Xuan Chai, Yao Deng Fam, Quinito Norman Octaviano, Chih-Yang Pee, Lai-Kuan Wong, Mas Ira Syafila Mohd Hilmi Tan, John See:

Improved Cassava Plant Disease Classification with Leaf Detection. 1-6 - Onhi Kato, Akira Kubota:

Zero-Shot Learning for Haze Removal Using Fusion of Near-Infrared and Color Images. 1-6 - Cuixin Yang, Rongkang Dong, Kin-Man Lam:

Efficient Adaptation for Real-World Omnidirectional Image Super-Resolution. 1-6 - Ken Kurata, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:

Noise-Robust Estimation of Early-part Room Impulse Responses based on Physics-Informed Neural Network with Dynamic Pulling Method. 1-5 - Rongkang Dong, Cuixin Yang, Kin-Man Lam:

Text-guided Visual Prompt Tuning with Masked Images for Facial Expression Recognition. 1-6 - Xiangyu Zhao, Yanjun Li, Zhengpeng Zha, Zhenhua Ling:

MGVul: a Multi-Granularity Detection Framework for Software Vulnerability. 1-6 - Joonyong Park, Daisuke Saito, Nobuaki Minematsu:

Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model. 1-6 - Roshan Birjais, Kevin I-Kai Wang, Waleed H. Abdulla:

Training Deep Neural Networks with HSIC and Backpropagation. 1-5 - Nitya Tiwari, Arjun Reddy Vadyala, K. S. Nataraj:

Automated prediction of loudness growth curve using EEG signals. 1-6 - Tianyu Gong, Tao Zhang, Ye Zhong, Mengmeng Zhang, Huihui Bai:

Screen Content Encoding Network Based on Deep Contextual Information. 1-6 - Dengyong Zhang, Runqi Lou, Jiaxin Chen, Xin Liao, Gaobo Yang, Xiangling Ding:

Dual Motion Attention and Enhanced Knowledge Distillation for Video Frame Interpolation. 1-6 - Yuanchen Niu, Yuanman Li, Guijia Zhang, Xia Li:

A Diffusion-Based Approach for Restoring Face-swapped Images. 1-5 - Raffaele Disabato, AprilPyone MaungMaung, Huy H. Nguyen, Isao Echizen:

Transfer-Based Adversarial Attack Against Multimodal Models by Exploiting Perturbed Attention Region. 1-6 - Yih-Liang Shen, Tai-Shih Chi:

Ensemble learning based head-related transfer function personalization using anthropometric features. 1-6 - Jinkai Zhang, Zijuan Han, Yunxia Liu, Yang Yang:

A Multi-Domain Camera Model Identification Feature Restoration Network to Counter AI Compression Attacks. 1-6 - Hayato Takeuchi, Takao Kawamura, Nobutaka Ono, Shoko Araki:

Synchronization of Signals with Sampling Rate Offset and Missing Data Using Dynamic Programming Matching. 1-6 - Muwei Jian, Yukun Ling, Rui Wang, Yanjie Zhong, Huihui Huang, Xiaoguang Li:

RepViT Based Lightweight Architecture for Distracted Driving Detection. 1-6 - Satoshi Shoji, Wataru Yata, Keita Kume, Isao Yamada:

A Discrete-Valued Signal Estimation by Nonconvex Enhancement of SOAV with cLiGME Model. 1-6 - Yuki Nishi, Koichi Shinoda, Koji Iwano:

LDMSE: Low Computational Cost Generative Diffusion Model for Speech Enhancement. 1-6 - Xiao Zhang, Haoran Xing, Mingxue Song, Daiki Takeuchi, Noboru Harada, Shoji Makino:

Prediction-error-based Adaptive SpecAugment for Fine-tuning the Masked Model on Audio Classification Tasks. 1-6 - Chun-Lin Liao, Jian-Jiun Ding, Chun-Jen Shih:

Non-blind Deblurring Using Probabilistic Models and Spatial Adaptive Restoration. 1-6 - Primanda Adyatma Hafiz, Candy Olivia Mawalim, Dessi Puji Lestari, Sakriani Sakti, Masashi Unoki:

Anomalous Machine Sound Detection Based on Time Domain Gammatone Spectrogram Feature and IDNN Model. 1-6 - Jiahao Zhang, Qi Liu, Le Hui, Yuchao Dai:

A Two-Stage Method for 3D Architecture Wireframe Reconstruction from Airborne LiDAR Point Cloud. 1-6 - Zhirun Li, Shisheng Guo, Jiahui Chen, Zhihao Zhu, Chen Qiu, Guolong Cui, Yutao Xiang:

A Two-Stage Wall Parameters Estimation Algorithm for MIMO Through-the-Wall Radar. 1-5 - Naohito Yoshikawa, Masaaki Ikehara:

Enhancing YOLOv7 with GLF-Trans for Precision in Small Object Detection. 1-5 - Shuhong Chen, Zewei Chen, Chen Li, Xianwei Zheng, Minfan He, Xutao Li:

Adaptive Time-Varying Graph Learning for Traffic Flow Data Based on Anomaly Moment Detection. 1-5 - Quang-Hai Luong, Duc-Nghia Tran, Sy-Hiep Nguyen, Lam Sinh Cong, Duc-Tan Tran:

Enhancing Shear Wave Propagation Analysis in Tissue with Directional Filtering of Reflected Waves. 1-6 - Zepeng Zhang, Ziping Zhao:

A Joint Graph Signal and Laplacian Denoising Network. 1-5 - Nichika Koyama, Nari Tanabe, Masaya Fujisawa:

Hammering Inspection System Using HPSS and Gradient Boosting with a Wall-Climbing Robot. 1-5 - Huiyun Hu, Junda Kong, Fei Wang, Hongzhi Sun, Yang Ge, Bo Xiao:

GMNER-LF: Generative Multi-modal Named Entity Recognition Based on LLM with Information Fusion. 1-6 - Zewei Chen, Shuhong Chen, Chen Li, Xianwei Zheng, Minfan He, Xutao Li:

Knowledge Augmented Attention Gating Embedding for Link Prediction. 1-5 - Seung-Won Lee, Jun-Seok Lee, Han-Jeong Hwang:

Effect of White Noise on Working Memory Using Event-Related Potentials. 1-4 - Natchira Dachoponchai, Yodchanan Wongsawat, Jetsada Arnin:

Predictive Analysis of Driver Drowsiness Progression: Multi-Level Drowsiness Classification Using Physiological Signals. 1-6 - Ryota Seo, Minoru Kuribayashi, Akinobu Ura, Antoine Mallet, Rémi Cogranne, Wojciech Mazurczyk, David Megías:

Toward Universal Detector for Synthesized Images by Estimating Generative AI Models. 1-6 - Joshua Murphy, Conor Rosato, Andrew Millard, Simon Maskell:

Parameterizing Hierarchical Particle Filters with Concept Drift for Time-varying Parameter Estimation. 1-6 - Eunsoo Hong, Sunhee Kim, Minhwa Chung:

Unsupervised Discovery of Non-Categorical L2 Error Patterns Using Wav2Vec2.0 Code Vectors. 1-6 - Jiachen Qiu, Yushen Zuo, Kin-Man Lam:

ACE-Flow: Auto Color Encoding for Enhanced Low-Light Image Restoration. 1-6 - Dengyong Zhang, Chuanzhen Xu, Jiaxin Chen, Bin Deng, Xin Liao:

YOLO-DC: Enhancing object detection with deformable convolutions and contextual mechanism. 1-6 - Yohei Horiguchi, Masaaki Ikehara, Kei Shibasaki:

More Direct and stage-wise network for Face Super Resolution. 1-6 - Z. Guo, Y. H. Chan, N. F. Law:

Deep Learning-based Intraoperative Video Analysis for Cataract Surgery Instrument Identification. 1-7 - Changsheng Chen, Wenyu Chen, Ximin Chen, Haodong Li:

A Document Presentation Attack Detection Scheme with Optical Flow under a Flashlight. 1-6 - Han Wang, Mingrui He, Mingjun Zhang, Longting Xu:

Semi-Supervised Far-Field Speaker Verification with Distance Metric Domain Adaptation. 1-6 - Alika Choo, Arghya Pal, Sailaja Rajanala, Arkendu Sen:

META: Text Detoxification by leveraging METAmorphic Relations and Deep Learning Methods. 1-6 - Wendi Zhu, KokSheik Wong, Minoru Kuribayashi:

A Permutation-based Reversible Data Hiding Method with Zero Visual Distortion. 1-6 - Yuan Hu, Yifan Zhang, Mingyang Ma, Shaohui Mei:

A Coarse-to-Fine Change Detection Method for Remote Sensing Sparse Cultivated Land. 1-6 - Chen-Jui Hsu, Jian-Jiun Ding, Chun-Jen Shih:

Tsnake: A Time-Embedded Recurrent Contour-Based Instance Segmentation Model. 1-6 - Minyoung Oh, Jae-Young Sim:

Lifelong Person Re-Identification with Backward-Compatibility. 1-6 - Kaito Takahashi, Yukoh Wakabayashi, Kengo Ohta, Akio Kobayashi, Norihide Kitaoka:

Domain Adaptation by Alternating Learning of Acoustic and Linguistic Information for Japanese Deaf and Hard-of-Hearing People. 1-7 - Xiaohan Fang, Peilin Chen, Meng Wang, Shiqi Wang:

How Accurate Can Large Vision Language Model Perform for Images with Compression Degradation? 1-6 - Guojian Lin, Yu Tsao, Fei Chen:

A Non-Intrusive Speech Quality Assessment Model using Whisper and Multi-Head Attention. 1-6 - Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:

Fine-Grained Quantitative Emotion Editing for Speech Generation. 1-6 - Tomohiro Hayashi, Riku Ogino, Kohei Saijo, Tetsuji Ogawa:

What to Refer and How? - Exploring Handling of Auxiliary Information in Target Speaker Extraction. 1-6
