


MMAsia 2023: Tainan, Taiwan
- Wen-Huang Cheng, Wei-Ta Chu, Min-Chun Hu, Jiaying Liu, Munchurl Kim, Wei Zhang: ACM Multimedia Asia 2023, MMAsia 2023, Tainan, Taiwan, December 6-8, 2023. ACM 2023
Full Papers
- Yu-Jou Chen, Yu-Shuen Wang: TrackNetV3: Enhancing ShuttleCock Tracking with Augmentations and Trajectory Rectification. 1:1-1:7
- Pengju Wang, Bochao Liu, Dan Zeng, Chenggang Yan, Shiming Ge: Personalized Federated Learning via Backbone Self-Distillation. 2:1-2:7
- Naifu Xue, Yuan Zhang: Lambda-Domain Rate Control for Neural Image Compression. 3:1-3:7
- Weijie Luo, Zihao Liu, Guohao Dai, Ningyi Xu: History-Detr: Optimize Query Initialization Strategy by Using Historical Information and Kinematics. 4:1-4:7
- Gaohuan Dong, Qing Xie, Jiachen Li, Yanchun Ma, Yuhan Liu, Yongjian Liu: A Multi-scale and Dense Object Detector for Tibetan Thangka Images. 5:1-5:7
- Yidan Fan, Yongxin Yu, Wenhuan Lu, Yahong Han: A Cross-modal and Redundancy-reduced Network for Weakly-Supervised Audio-Visual Violence Detection. 6:1-6:7
- Siqi Zhang, Jing Liu, Zhihua Wei: From Pixels to Explanations: Uncovering the Reasoning Process in Visual Question Answering. 7:1-7:9
- Hong Chen, Bin Huang, Xin Wang, Yuwei Zhou, Wenwu Zhu: Global-Local GraphFormer: Towards Better Understanding of User Intentions in Sequential Recommendation. 8:1-8:7
- Jie Liu, Qin Jiang, Qinglin Wang: Guided Spatio-Temporal Learning Method for 4K Video Super-Resolution. 9:1-9:7
- Jiansong Sha, Haoyu Zhang, Yuchen Pan, Guang Kou, Xiaodong Yi: NeRF-IS: Explicit Neural Radiance Fields in Semantic Space. 10:1-10:7
- Qiuwen Wang, Shuai Guo, Haoning Wu, Rong Xie, Li Song, Wenjun Zhang: NeRF-SDP: Efficient Generalizable Neural Radiance Field with Scene Depth Perception. 11:1-11:7
- Zhengtao Yu, Jia Zhao, Huiling Wang, Chenliang Guo, Tong Zhou, Chongxiang Sun: Adaptive Fusion for Visual Question Answering: Integrating Multi-Label Classification and Similarity Matching. 12:1-12:7
- Dongyang Yu, Yunshi Xie, Wangpeng An, Zhang Li, Yufeng Yao: Joint Coordinate Regression and Association For Multi-Person Pose Estimation, A Pure Neural Network Approach. 13:1-13:8
- Yang Fan Chiang, Pei-Xuan Li, Ding-You Wu, Hsun-Ping Hsieh, Ching-Chung Ko: Exploring Feature Fusion from A Contrastive Multi-Modality Learner for Liver Cancer Diagnosis. 14:1-14:7
- Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu: Learning Snippet-to-Motion Progression for Skeleton-based Human Motion Prediction. 15:1-15:8
- Jianping Zhong, Zhaobo Qi, Weigang Zhang, Qingming Huang: Semantic-Aware Dynamic Feature Selection and Fusion for Object Detection in UAV Videos. 16:1-16:7
- Xinshun Wang, Qiongjie Cui, Chen Chen, Shen Zhao, Mengyuan Liu: Graph-Guided MLP-Mixer for Skeleton-Based Human Motion Prediction. 17:1-17:7
- Yuqing Song, Jinyong Cheng: Self-supervised anomaly detection of medical images based on dual-module discrepancy. 18:1-18:7
- Lijie Li, Caiyue Hu, Haitao Zhang, Akshita Maradapu Vera Venkata Sai: Cross-modal Image-Recipe Retrieval via Multimodal Fusion. 19:1-19:7
- Zitan Chen, Zhuang Qi, Xiangxian Li, Yuqing Wang, Lei Meng, Xiangxu Meng: Class-aware Convolution and Attentive Aggregation for Image Classification. 20:1-20:7
- Xiu Li, Chengyu Zheng, Jie Nie, Ruoyu Zhang, Xinyue Liang, Zhiqiang Wei: Relevance and Irrelevance Considered Subspace Mapping Neural Networks for Remote Sensing Text-Image Retrieval. 21:1-21:7
- Haixin Wang, Jian Yang, Ryohei Katayama, Michiya Matsusaki, Tomoyuki Miyao, Jinjia Zhou: NuclSeg: nuclei segmentation using semi-supervised stain deconvolution. 22:1-22:6
- Kosuke Iwama, Ryugo Morita, Jinjia Zhou: Block based Adaptive Compressive Sensing with Sampling Rate Control. 23:1-23:7
- Jie Yang, Aihua Ke, Bo Cai: Adapting Hierarchical Transformer for Scene-Level Sketch-Based Image Retrieval. 24:1-24:7
- I-Ju Hsieh, Yo-Chung Lau, Peng-Yuan Kao, Shih-Ping Hung, Yi-Ping Hung: Domain-Adaptive Mean Teacher for Category-Level Object Pose Estimation. 25:1-25:8
- Guangxing Wu, Junxi Chen, Wentao Zhang, Ruixuan Wang: Feature Adaptation with CLIP for Few-shot Classification. 26:1-26:7
- Jun Li, Yi Bin, Jie Zou, Jiwei Wei, Guoqing Wang, Yang Yang: Cross-modal Consistency Learning with Fine-grained Fusion Network for Multimodal Fake News Detection. 27:1-27:7
- Yusheng Huang, Zhouhan Lin: I2SRM: Intra- and Inter-Sample Relationship Modeling for Multimodal Information Extraction. 28:1
- Ci-Yin Zhang, Wei-Ta Chu: Occlusion-Aware Manga Character Re-identification with Self-Paced Contrastive Learning. 29:1-29:7
- Yipeng Leng, Qiangjuan Huang, Zhiyuan Wang, Yangyang Liu, Haoyu Zhang: DiffuseGAE: Controllable and High-fidelity Image Manipulation from Disentangled Representation. 30:1-30:7
- Qiaowei Ma, Jinghui Zhong, Yitao Yang, Weiheng Liu, Ying Gao, Wing W. Y. Ng: A Lightweight and Efficient Model for Audio Anti-Spoofing. 31:1-31:7
- Yuxiang Wan, Banghai Wang, Lunke Fei: SOFTCUTMIX: Data Augmentation and Algorithmic Enhancements for Cross-Modality Person Re-Identification. 32:1-32:7
- Zehan Tan, Weidong Yang, Zhiwei Wang: Reimagining 3D Visual Grounding: Instance Segmentation and Transformers for Fragmented Point Cloud Scenarios. 33:1-33:7
- Peng Zhang, Yida Chen, Meijuan Li, Hui Zhao, Jianqiang Zhang, Fuqiang Wang, Xiaoming Wu: Speech Spoofing Detection Based on Graph Attention Networks with Spectral and Temporal Information. 34:1-34:7
- Jingyi Cao, Bo Liu, Yunqian Wen, Rong Xie, Li Song: Achieving Privacy-Preserving Multi-View Consistency with Advanced 3D-Aware Face De-identification. 35:1-35:7
- Suzanne Kobeisse, Lars Erik Holmquist: Moving Inside the Box: Interacting with Interpretation of Historical Artefacts Through Tangible Augmented Reality. 36:1-36:7
- Songhui Zhao, Sujuan Hou, Baisong Zhang: A Decoupled Cross-layer Fusion Network with Bidirectional Guidance for Detecting Small Logos. 37:1-37:8
- Guangtong Zhang, Qihua Liang, Ning Li, Zhiyi Mo, Bineng Zhong: Robust Tracking via Unifying Pretrain-Finetuning and Visual Prompt Tuning. 38:1-38:7
- Jie-Ying Li, Herman Prawiro, Chia-Chen Chiang, Hsin-Yu Chang, Tse-Yu Pan, Chih-Tsun Huang, Min-Chun Hu: Efficient Hand Gesture Recognition using Multi-Task Multi-Modal Learning and Self-Distillation. 39:1-39:7
- Takumi Nishiyasu, Wataru Shimoda, Yoichi Sato: Image Cropping under Design Constraints. 40:1-40:7
- Lin Wang, Hongyi Zhang, Xingfu Wang, Yan Xiong: Learning a Contextualized Multimodal Embedding for Zero-shot Cooking Video Caption Generation. 41:1-41:8
- Jingwen Cui, Qian Huang, Chang Li, Yunfei Zhang: MA-Net: Multi-Attention Network for Skeleton-Based Action Recognition. 42:1-42:7
- Zhenglin Tang, Hai-Miao Hu: A Spatial-Spectral Decoupling Fusion Framework for Visible and Near-Infrared Images. 43:1-43:7
- Huaizhuo Liu, Hai-Miao Hu: From Global to Local: An Adaptive Environmental Illumination Estimation for Non-uniform Scattering. 44:1-44:7
- Wei Guo, Hao Wang: Key Parts Spatio-Temporal Learning for Video Person Re-identification. 45:1-45:6
- Zhongtao Chen, Yuma Honbu, Keiji Yanai: Mask-based Food Image Synthesis with Cross-Modal Recipe Embeddings. 46:1-46:7
- Yuki Matsuura, Takahiro Hayashi: AniCropify: Image Matting for Anime-Style Illustration. 47:1-47:7
- Fei Zhu, Wanqian Zhang, Dayan Wu, Lin Wang, Bo Li, Weiping Wang: Targeted Transferable Attack against Deep Hashing Retrieval. 48:1-48:7
- Zhewen Deng, Dongyue Chen, Shizhuo Deng: Prior Knowledge Guided Network for Video Anomaly Detection. 49:1-49:7
- Peng Liu, Chuanxu Wang, Jianwei Qin, Guocheng Lin: Feature Enhancement and Foreground-Background Separation for Weakly Supervised Temporal Action Localization. 50:1-50:7
- Longfei Ma, Honggang Zhao, Zheng Jiang, Mingyong Li: Multi-view-enhanced modal fusion hashing for Unsupervised cross-modal retrieval. 51:1-51:7
- Weiliang Xie, Qian Huang, Chang Li, Yanfang Wang, Yanwei Liu: Hierarchical Multi-Scale Adaptive Conv-LSTM Network for Human Action Recognition Based on Wearable Sensors. 52:1-52:8
- Po-Han Huang, Yue-Hua Han, Ernie Chu, Jun-Cheng Chen, Kai-Lung Hua: Multi-Task Self-Blended Images for Face Forgery Detection. 53:1-53:7
- Zichen Zhu, Stefano Petrangeli, Viswanathan Swaminathan, Sheng Wei: Power Efficient Mobile VTuber Live Streaming. 54:1-54:7
- Quang Long Nguyen, Duc Nguyen, Huong Thu Truong: Toward Optimal Real-time Dynamic Point Cloud Streaming over Bandwidth-constrained Networks. 55:1-55:7
- Jialiang Shi, Takahiro Komamizu, Keisuke Doman, Haruya Kyutoku, Ichiro Ide: RecipeMeta: Metapath-enhanced Recipe Recommendation on Heterogeneous Recipe Network. 56:1-56:7
- Shifeng Xie, Yi Liu, Wenjing Shuai: FTUnet: Feature Transferred U-Net For Single HDR Image Reconstruction. 57:1-57:7
- Bin Zheng, He Zhang, Lu Jin: Research on Multi-Person Pose Estimation Based on YOLO and Decoupled Multi-Level Feature Layers Fusion. 58:1-58:7
- Yi Zheng, Zuqiang Meng: Towards Representation Alignment and Uniformity in Long-tailed Classification. 59:1-59:7
- Shangwang Liu, Danyang Liu, Yinghai Lin, Ziqi Wei: SFNet: Saliency fast Fourier convolutional Network for medical image segmentation. 60:1-60:7
- Peng-Fei Zhang, Zi Helen Huang: Multi-head Siamese Prototype Learning against both Data and Label Corruption. 61:1-61:7
- Shengli Zhang, Shikui Wei, Shiyin Zhang, Sen Xu, Weiyan Xu, Yao Zhao: Rethinking Parking Slot Detection with Rotated Bounding Box. 62:1-62:7
- Yiming Huang, Aozhe Jia, Xiaodan Zhang, Jiawei Zhang: Generic Attention-model Explainability by Weighted Relevance Accumulation. 63:1-63:7
- Miaomiao Dai, Hao Yin, Ran Yi, Lizhuang Ma: Geometric Style Transfer for Face Portraits. 64:1-64:7
- Huashan Sun, Qian Huang, Yiming Wang, Xiaotong Guo, Ruoyu Hao: Optical Flow based Feature Prediction and Decomposed Context for Video Compression. 65:1-65:7
- Yaqun Fang, Ruichao Hou, Jia Bei, Tongwei Ren, Gangshan Wu: ADNet: An Asymmetric Dual-Stream Network for RGB-T Salient Object Detection. 66:1-66:7
- Boyue Xu, Yi Xu, Ruichao Hou, Jia Bei, Tongwei Ren, Gangshan Wu: RGB-D Tracking via Hierarchical Modality Aggregation and Distribution Network. 67:1-67:7
- Yun Liang, Shijie Peng, Xinjie Xiao, Lianghui Li: Dual-domain Feature Learning and Cross Dimension Interaction Attention for Nighttime Image Dehazing. 68:1-68:7
- Ping-Chen Chan, Po-Wei Chen, Von-Wun Soo: Improve Singing Quality Prediction Using Self-supervised Transfer Learning and Human Perception Feedback. 69:1-69:7
- Xiaotong Guo, Qian Huang, Yiming Wang, Huashan Sun: End-to-End Variable-Rate Image Compression with Bi-Resolution Spatial-Channel Context Aggregation. 70:1-70:7
- Yun Liang, Ming Junhui, Jintu Zheng: SASSM: Semantic Awareness and Self-Support Matching for Semi-Supervised Video Object Segmentation. 71:1-71:7
- Yun Liang, Fumian Long, Qiaoqiao Li, Dong Wang: GTTrack: Gaussian Transformer Tracker for Visual Tracking. 72:1-72:7
- Iuan Kai Fang, Bo-Hao Zhang, Te Lun Liu, Hao Tan, Wei Syun Chen, Che-Rung Lee: MontageNet: Annotated Dataset of Furniture Components in Real-World Images. 73:1-73:7
- Avinash Anand, Raj Jaiswal, Mohit Gupta, Siddhesh S. Bangar, Pijush Bhuyan, Naman Lal, Rajeev Singh, Ritika Jha, Rajiv Ratn Shah, Shin'ichi Satoh: RanLayNet: A Dataset for Document Layout Detection used for Domain Adaptation and Generalization. 74:1-74:6
- Ft Zheng, Le Hui, Jin Xie, Haofeng Zhang: Multi-Scale Superpoint Network for 3D Point Cloud Semantic Segmentation. 75:1-75:7
- Zhe Chen, Jiyi Li, Fumiyo Fukumoto, Peng Liu, Yoshimi Suzuki: Vision-Language Navigation for Quadcopters with Conditional Transformer and Prompt-based Text Rephraser. 76:1-76:7
- Jiajie Lin, Zhuopan Yang, Zhenguo Yang, Xiaoping Li, Fu Lee Wang, Wenyin Liu: Confidence-guided Boundary Adaption Network for Multimodal Fake News Detection. 77:1-77:7
- Satayu Parinayok, Yoko Yamakata, Kiyoharu Aizawa: Open-Vocabulary Segmentation Approach for Transformer-Based Food Nutrient Estimation. 78:1-78:7
- Lijuan Zhou, Jianing Mao: Improving Class Representation for Zero-Shot Action Recognition. 79:1-79:7
- Yan Niu, Lixue Zhang, Chenlai Li: Independent and Collaborative Demosaicking Neural Networks. 80:1-80:7
- Xinyi Yuan, Liansheng Zhuang: Learning a Robust Model with Pseudo Boundaries for Noisy Temporal Action Localization. 81:1-81:7
- Sung Kwon On, Songhyon Kim, Kwangjin Yang, Younggun Lee: Monocular 3D Pose Estimation of Very Small Airplane in the Air. 82:1-82:7
- Sheng Yan, Yang Liu, Haoqiang Wang, Xin Du, Mengyuan Liu, Hong Liu: Cross-Modal Retrieval for Motion and Text via DropTriple Loss. 83:1-83:7
- Hamed Alimohammadzadeh, Heather Culbertson, Shahram Ghandeharizadeh: An Evaluation of Decentralized Group Formation Techniques for Flying Light Specks. 84:1-84:7
Short Papers
- Ying Shen, Wei Li, Zhaoquan Yuan, Xiao Wu: Learning Surface-awareness Network for X-Ray Prohibited Item Detection. 85:1-85:5
- Mingjin Wu, Shijun Xiang: An Efficient CNN-based Prediction for Reversible Data Hiding. 86:1-86:5
- Kuo-Yu Liu, Yuanshan Chen, Ming-Fang Lin, Li-Jung Daphne Huang, Cheah Ping Xiang: Developing a VR-based contextualized language learning system to Enhance Junior High School Students' Pragmatic Competence. 87:1-87:5
- Shunta Sakaue, Taiju Kimura, Hiroki Nishino: Reducing Objective Difficulty Without Influencing Subjective Difficulty in a Video Game. 88:1-88:5
- Keita Suzuki, Satoshi Suzuki, Ryo Masumura, Atsushi Ando, Naoki Makishima: Multi-region CNN-Transformer for Micro-gesture Recognition in Face and Upper Body. 89:1-89:5
- Ryota Kaji, Keiji Yanai: VQ-VDM: Video Diffusion Models with 3D VQGAN. 90:1-90:5
- Xianhao Chen, Kuan Chen, Yuzhe Mao, Linna Zhou, Weike You: Facial Parameter Splicing: A Novel Approach to Efficient Talking Face Generation. 91:1-91:5
- Nouf Alrasheed, Shraboni Sarker, Viviana Grieco, Praveen Rao: Few-Shot Learning for Word Recognition in Handwritten Seventeenth-Century Spanish American Notary Records. 92:1-92:5
- Xiaojiao Chen, Sheng Li, Jiyi Li, Hao Huang, Yang Cao, Liang He: Reprogramming Self-supervised Learning-based Speech Representations for Speaker Anonymization. 93:1-93:5
- Xiaojiao Chen, Sheng Li, Jiyi Li, Yang Cao, Hao Huang, Liang He: GhostVec: A New Threat to Speaker Privacy of End-to-End Speech Recognition System. 94:1-94:5
- Karanvir Singh, Mukesh Saini: Towards Digital Twin of Crops for Growth Modelling using Virtual Reality. 95:1-95:5
- Jeonguk Hong, Gyewon Jeon, Sangwon Lee: Exploring User-oriented Social Recommendation System through Granting Users Control over a Social Group. 96:1-96:5
- Taiwei Wu, Jianhao Zhang, Lian Duan, Yuanzhe Cai: Music-Graph2Vec: An Efficient Method for Embedding Pitch Segment. 97:1-97:5
- Luyang Liu, Hiroki Nishikawa, Jinjia Zhou, Ittetsu Taniguchi, Takao Onoye: Adaptive Sampling for Computer Vision-Oriented Compressive Sensing. 98:1-98:5
- Yan Li, Shibin Wang: EmAGAN: Embedded Blocks Search and Mask Attention GAN for Makeup Transfer. 99:1-99:5
- Jingbin Xu, Junwen Chen, Keiji Yanai: Contextual Associated Triplet Queries for Panoptic Scene Graph Generation. 100:1-100:5
- Liangyu Wang, Yoko Yamakata, Kiyoharu Aizawa: Automatic Dataset Creation from User-generated Recipes for Ingredient-centric Food Image Analysis. 101:1-101:5
Demo Papers
- Yen-Pin Cheng, Tsung-Hsun Tsai, Tai-Chen Tsai, Yi-Hsuan Chiu, Hung-Kuo Chu, Min-Chun Hu: OmniScorer: Real-Time Shot Spot Analysis for Court View Basketball Videos. 102:1-102:3
- Chen-Wei Fu, Wei-Lun Huang, Pin-Xuan Liu, Yu-Hsuan Chen, Ming-Cong Su, Andrew Chen, Ping-Hsuan Han, Tse-Yu Pan: TelEmoScatter: Enabling Remote Interaction and Emotional Connections in Virtual and Physical Music Performance. 103:1-103:3
- Wenlong Du, Qingquan Li, Jian Zhou, Xu Ding, Xuewei Wang, Zhongjun Zhou, Jin Liu: FinGuard: A Multimodal AIGC Guardrail in Financial Scenarios. 104:1-104:3
- Fan Yu, Huanyu Xing, Jia Bei, Tongwei Ren: Easy Travelogue: A Travelogue Editor with Automatic Image Recommendation and Insertion. 105:1-105:3
- Shota Okubo, Tomoaki Konno, Toshiharu Horiuchi, Tatsuya Kobayashi: Directional Sound Source Representation Using Paired Microphone Array with Different Characteristics Suitable for Volumetric Video Capture. 106:1-106:3
- Guan-Yu Wu, Chun-Ho Hung, Hsuan-Wei Chen, Wei-Ta Chu: A Trajectory-based Statistics and Tactics Analysis System for Table Tennis. 107:1-107:3
- Ryo Kawai, Noboru Yoshida, Jianquan Liu: A consulting system for guiding various image recognitions. 108:1-108:3
- Yiyun Zhang, Zijian Wang: VLM-BCD: Unsupervised Building Change Detection. 109:1-109:3
- Yu-Hsi Chen: One-Epoch Training for Object Detection in Fisheye Images. 110:1-110:5
- Chih-Chung Hsu, Wen-Hai Tseng, Ming-Hsuan Wu, Chia-Ming Lee, Wei-Hao Huang: Adapting Object Detection to Fisheye Cameras: A Knowledge Distillation with Semi-Pseudo-Label Approach. 111:1-111:6
- Yi-Zeng Hsieh, Hau-Ching Chen, Yi-Hung Yeh: Object Detection via Fisheye Camera. 112:1-112:7
- Yu-Shu Ni, Chia-Chi Tsai, Jyun-Syu Lin, Hsien-Po Meng, Po-Chi Hu, Jiun-Shiung Chen, Kun-Hung Lin, Chih-Yuan Chuang, Jiun-In Guo: Summary of the 2023 PAIR-LITEON Competition: Embedded AI Object Detection Model Design Contest on Fish-eye Around-view Cameras. 113:1-113:7















