


ICMR 2021: Taipei, Taiwan
- Wen-Huang Cheng, Mohan S. Kankanhalli, Meng Wang, Wei-Ta Chu, Jiaying Liu, Marcel Worring:

ICMR '21: International Conference on Multimedia Retrieval, Taipei, Taiwan, August 21-24, 2021. ACM 2021, ISBN 978-1-4503-8463-6
Full Research Papers
- Evlampios Apostolidis, Eleni Adamantidou, Vasileios Mezaris, Ioannis Patras:
Combining Adversarial and Reinforcement Learning for Video Thumbnail Selection. 1-9
- Petra Budíková, Jan Sedmidubský, Pavel Zezula:
Efficient Indexing of 3D Human Motions. 10-18
- Jie Cao, Shengsheng Qian, Huaiwen Zhang, Quan Fang, Changsheng Xu:
Global Relation-Aware Attention Network for Image-Text Retrieval. 19-28
- Pei-Chun Chang, Yong-Sheng Chen, Chang-Hsing Lee:
MS-SincResNet: Joint Learning of 1D and 2D Kernels Using Multi-scale SincNet and ResNet for Music Genre Classification. 29-36
- Xu Chen, Lei Wu, Minggang He, Lei Meng, Xiangxu Meng:
MLFont: Few-Shot Chinese Font Generation via Deep Meta-Learning. 37-45
- Yiu-Ming Cheung, Mengke Li, Rong Zou:
Facial Structure Guided GAN for Identity-preserved Face Image De-occlusion. 46-54
- Feifei Dai, Xiaoyan Gu, Zhuo Wang, Mingda Qian, Bo Li, Weiping Wang:
Heterogeneous Side Information-based Iterative Guidance Model for Recommendation. 55-63
- Feng Dai, Hao Liu, Yike Ma, Xi Zhang, Qiang Zhao:
Dense Scale Network for Crowd Counting. 64-72
- Yujuan Ding, Yunshan Ma, Wai Keung Wong, Tat-Seng Chua:
Leveraging Two Types of Global Graph for Sequential Fashion Recommendation. 73-81
- Yu Duan, Yun Xiong, Yao Zhang, Yuwei Fu, Yangyong Zhu:
HSGMP: Heterogeneous Scene Graph Message Passing for Cross-modal Retrieval. 82-91
- Cheikh Brahim El Vaigh, Noa Garcia, Benjamin Renoust, Chenhui Chu, Yuta Nakashima, Hajime Nagahara:
GCNBoost: Artwork Classification by Label Propagation through a Knowledge Graph. 92-100
- Yuqian Fu, Yanwei Fu, Yu-Gang Jiang:
Can Action be Imitated? Learn to Reconstruct and Transfer Human Dynamics from Videos. 101-109
- Ziwang Fu, Feng Liu, Jiahao Zhang, Hanyang Wang, Chengyi Yang, Qing Xu, Jiayin Qi, Xiangling Fu, Aimin Zhou:
SAGN: Semantic Adaptive Graph Network for Skeleton-Based Human Action Recognition. 110-117
- Liying Gao, Kai Niu, Zehong Ma, Bingliang Jiao, Tonghao Tan, Peng Wang:
Text-Guided Visual Feature Refinement for Text-Based Person Search. 118-126
- Yuhui Guo, Xun Liang:
RGB-D Scene Recognition based on Object-Scene Relation and Semantics-Preserving Attention. 127-134
- Xiaoshuai Hao, Yucan Zhou, Dayan Wu, Wanqian Zhang, Bo Li, Weiping Wang:
Multi-Feature Graph Attention Network for Cross-Modal Video-Text Retrieval. 135-143
- Bin Ji, Chen Yang, Shunyu Yao, Ye Pan:
HPOF: 3D Human Pose Recovery from Monocular Video with Optical Flow. 144-154
- Giorgos Kordopatis-Zilos, Panagiotis Galopoulos, Symeon Papadopoulos, Ioannis Kompatsiaris:
Leveraging EfficientNet and Contrastive Learning for Accurate Global-scale Location Estimation. 155-163
- Fangtao Li, Ting Bai, Chenyu Cao, Zihe Liu, Chenghao Yan, Bin Wu:
Relation-aware Hierarchical Attention Framework for Video Question Answering. 164-172
- Jiao Li, Jialiang Sun, Xing Xu, Wei Yu, Fumin Shen:
Cross-Modal Image-Recipe Retrieval via Intra- and Inter-Modality Hybrid Fusion. 173-182
- Mingyong Li, Hongya Wang:
Unsupervised Deep Cross-Modal Hashing by Knowledge Distillation for Large-scale Cross-modal Retrieval. 183-191
- Qinghua Li, Xue Zhang, Cuiping Li, Hong Chen:
A Unified-Model via Block Coordinate Descent for Learning the Importance of Filter. 192-200
- Guoqiang Liang, Shiyu Ji, Yanning Zhang:
Local-enhanced Interaction for Temporal Moment Localization. 201-209
- Zhiguang Liu, Liangwei Wang, Jian Qiao:
Reading Scene Text by Fusing Visual Attention with Semantic Representations. 210-218
- Jia Long, Hongtao Lu:
Generative Adversarial Networks with Bi-directional Normalization for Semantic Image Synthesis. 219-226
- Junda Lu, Mingyang Chen, Yifang Sun, Wei Wang, Yi Wang, Xiaochun Yang:
A Smart Adversarial Attack on Deep Hashing Based Image Retrieval. 227-235
- Sanbi Luo, Tao Guo:
Image-to-Image Transfer Makes Chaos to Order. 236-243
- Yu-Shu Ni, Chia-Chi Tsai, Jiun-In Guo, Jenq-Neng Hwang, Bo-Xun Wu, Po-Chi Hu, Ted T. Kuo, Po-Yu Chen, Hsien-Kai Kuo:
Summary of the 2021 Embedded Deep Learning Object Detection Model Compression Competition for Traffic in Asian Countries. 244-249
- Cheng Qiu, Yirong Yao, Yuntao Du:
Nested Dense Attention Network for Single Image Super-Resolution. 250-258
- Yifan Ren, Xing Xu, Fumin Shen, Zheng Wang, Yang Yang, Heng Tao Shen:
Multi-scale Dynamic Network for Temporal Action Detection. 267-275
- Zikai Song, Zhiwen Wan, Wei Yuan, Ying Tang, Junqing Yu, Yi-Ping Phoebe Chen:
Distractor-Aware Tracker with a Domain-Special Optimized Benchmark for Soccer Player Tracking. 276-284
- Kimihiro Tanaka, Yusuke Matsui, Shin'ichi Satoh:
Efficient Nearest Neighbor Search by Removing Anti-hub. 285-293
- Lucas Pascotti Valem, Daniel Carlos Guimarães Pedronette:
A Denoising Convolutional Neural Network for Self-Supervised Rank Effectiveness Estimation on Image Retrieval. 294-302
- Shaoying Wang, Hanjiang Lai, Zhenyu Shi:
Know Yourself and Know Others: Efficient Common Representation Learning for Few-shot Cross-modal Retrieval. 303-311
- Xiaomei Wang, Lin Ma, Yanwei Fu, Xiangyang Xue:
Neural Symbolic Representation Learning for Image Captioning. 312-321
- Yangtao Wang, Yanzhao Xie, Yu Liu, Lisheng Fan:
G-CAM: Graph Convolution Network Based Class Activation Mapping for Multi-label Image Recognition. 322-330
- Lei Wu, Xueliang Liu, Yanbin Hao, Yunjie Ma, Richang Hong:
NASTER: Non-local Attentional Scene Text Recognizer. 331-338
- Ting-Ting Xie, Christos Tzelepis, Fan Fu, Ioannis Patras:
Few-Shot Action Localization without Knowing Boundaries. 339-348
- Baoming Yan, Qingheng Zhang, Liyu Chen, Lin Wang, Leihao Pei, Jiang Yang, Enyun Yu, Xiaobo Li, Binqiang Zhao:
Learning Hierarchical Visual-Semantic Representation with Phrase Alignment. 349-357
- Chenghao Yan, Zihe Liu, Fangtao Li, Chenyu Cao, Zheng Wang, Bin Wu:
Social Relation Analysis from Videos via Multi-entity Reasoning. 358-366
- Kun Yan, Zied Bouraoui, Ping Wang, Shoaib Jameel, Steven Schockaert:
Aligning Visual Prototypes with BERT Embeddings for Few-Shot Learning. 367-375
- Honglei Yao, Yu-Wei Zhan, Zhen-Duo Chen, Xin Luo, Xin-Shun Xu:
TEACH: Attention-Aware Deep Cross-Modal Hashing. 376-384
- Min Zhang, Meng Ma, Ping Wang:
Scene Text Recognition with Cascade Attention Network. 385-393
- Wen Zhang, Jie Shao:
Multi-Attention Audio-Visual Fusion Network for Audio Spatialization. 394-401
- Feng Zhao, Donglin Wang, Xintao Xiang:
Multi-Initialization Graph Meta-Learning for Node Classification. 402-410
- Xinzhe Zhou, Yadong Mu:
Question-Guided Semantic Dual-Graph Visual Reasoning with Novel Answers. 411-419
- Nan Zhuang, Yadong Mu:
Joint Hand-Object Pose Estimation with Differentiably-Learned Physical Contact Point Analysis. 420-428
- Zifeng Zhuang, Xintao Xiang, Siteng Huang, Donglin Wang:
HINFShot: A Challenge Dataset for Few-Shot Node Classification in Heterogeneous Information Network. 429-436
Short Research Papers
- Marco Cagrandi, Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Rita Cucchiara:
Learning to Select: A Fully Attentive Approach for Novel Object Captioning. 437-441
- Yu-Chen Chang, Wen-Cheng Chen, Min-Chun Hu:
Semi-supervised Many-to-many Music Timbre Transfer. 442-446
- Yan-He Chen, Mei-Chen Yeh:
Text-Enhanced Attribute-Based Attention for Generalized Zero-Shot Fine-Grained Image Classification. 447-450
- Konstantinos Gkountakos, Despoina Touska, Konstantinos Ioannidis, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris:
Spatio-Temporal Activity Detection and Recognition in Untrimmed Surveillance Videos. 451-455
- Haifan Gong, Guanqi Chen, Sishuo Liu, Yizhou Yu, Guanbin Li:
Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering. 456-460
- Shintami Chusnul Hidayati, Yeni Anistyasari:
Body Shape Calculator: Understanding the Type of Body Shapes from Anthropometric Measurements. 461-465
- Hussain Kanafani, Junaid Ahmed Ghauri, Sherzod Hakimov, Ralph Ewerth:
Unsupervised Video Summarization via Multi-source Features. 466-470
- Tarun Krishna, Kevin McGuinness, Noel E. O'Connor:
Evaluating Contrastive Models for Instance-based Image Retrieval. 471-475
- Xiaocheng Lu, Yuan Yuan, Qi Wang:
AWFA-LPD: Adaptive Weight Feature Aggregation for Multi-frame License Plate Detection. 476-480
- Zekun Luo, Zheng Fang, Sixiao Zheng, Yabiao Wang, Yanwei Fu:
NMS-Loss: Learning with Non-Maximum Suppression for Crowded Pedestrian Detection. 481-485
- Bowen Wang, Liangzhi Li, Yuta Nakashima, Takehiro Yamamoto, Hiroaki Ohshima, Yoshiyuki Shoji, Kenro Aihara, Noriko Kando:
Image Retrieval by Hierarchy-aware Deep Hashing Based on Multi-task Learning. 486-490
- Lan Yan, Wenbo Zheng, Fei-Yue Wang, Chao Gou:
Weakly Supervised Sketch Based Person Search. 491-495
- An-Zi Yen, Chia-Chung Chang, Hen-Hsen Huang, Hsin-Hsi Chen:
Personal Knowledge Base Construction from Multimodal Data. 496-500
- Kang Yuan, Sheng Li:
2.5D Pose Guided Human Image Generation. 501-505
- Min Zhu, Weifeng Liu, Kai Zhang, Ye Li, Peng Liu, Baodi Liu:
Collaborative Representation for Deep Meta Metric Learning. 506-510
Brave New Idea
- An-Zi Yen, Hen-Hsen Huang, Hsin-Hsi Chen:
Ten Questions in Lifelog Mining and Information Recall. 511-518
Challenge Papers
- Yongkun Du, Zhineng Chen, Caiyan Jia, Xuanya Li, Yu-Gang Jiang:
Bag of Tricks for Building an Accurate and Slim Object Detector for Embedded Applications. 519-525
- Chih-Chung Hsu, Chieh Lee, Lin Chen, Min-Kai Hung, Andy Yu-Lun Lin, Xian-Yu Wang:
Efficient-ROD: Efficient Radar Object Detection based on Densely Connected Residual Network. 526-532
- Bo Ju, Wei Yang, Jinrang Jia, Xiaoqing Ye, Qu Chen, Xiao Tan, Hao Sun, Yifeng Shi, Errui Ding:
DANet: Dimension Apart Network for Radar Object Detection. 533-539
- Bao-Hong Lai, Hsun-Ping Hsieh:
Object Detection on Embedded Systems for Traffic in Asian Countries. 540-544
- Pengliang Sun, Xuetong Niu, Pengfei Sun, Kele Xu:
Squeeze-and-Excitation network-Based Radar Object Detection With Weighted Location Fusion. 545-552
- Yizhou Wang, Jenq-Neng Hwang, Gaoang Wang, Hui Liu, Kwang-Ju Kim, Hung-Min Hsu, Jiarui Cai, Haotian Zhang, Zhongyu Jiang, Renshu Gu:
ROD2021 Challenge: A Summary for Radar Object Detection Challenge for Autonomous Driving Applications. 553-559
- Wen-Kai Wu, Chien-Yu Chen, Jiann-Shu Lee:
Embedded YOLO: Faster and Lighter Object Detection. 560-565
- Jun Yu, Xinlong Hao, Xinjian Gao, Qiang Sun, Yuyu Liu, Peng Chang, Zhong Zhang, Fang Gao, Feng Shuang:
Radar Object Detection Using Data Merging, Enhancement and Fusion. 566-572
- Zangwei Zheng, Xiangyu Yue, Kurt Keutzer, Alberto L. Sangiovanni-Vincentelli:
Scene-aware Learning Network for Radar Object Detection. 573-579
Conflict of Interest Papers
- Jia-Hong Huang, Luka Murn, Marta Mrak, Marcel Worring:
GPT2MVS: Generative Pre-trained Transformer-2 for Multi-modal Video Summarization. 580-589
- Omar Shahbaz Khan, Björn Þór Jónsson, Jan Zahálka, Stevan Rudinac, Marcel Worring:
Impact of Interaction Strategies on User Relevance Feedback. 590-598
Demonstrations
- Ting-Hsuan Chou, Wei-Ta Chu:
Automatic Baseball Pitch Overlay. 599-602
- Yuko Iinuma, Shin'ichi Satoh:
Video Action Retrieval Using Action Recognition Model. 603-606
- Mitchell Lee, Praveena Avula, Min Chen:
MeTILDA: Platform for Melodic Transcription in Language Documentation and Application. 607-610
- Rintaro Yanagi, Ren Togo, Takahiro Ogawa, Miki Haseyama:
IR Questioner: QA-based Interactive Retrieval System. 611-614
Reproducibility Paper
- Yunshan Ma, Yujuan Ding, Xun Yang, Lizi Liao, Wai Keung Wong, Tat-Seng Chua, Jinyoung Moon, Hong-Han Shuai:
Reproducibility Companion Paper: Knowledge Enhanced Neural Fashion Trend Forecasting. 615-618
Doctoral Consortium
- Fityanul Akhyar, Chih-Yang Lin, Gugan S. Kathiresan:
A Beneficial Dual Transformation Approach for Deep Learning Networks Used in Steel Surface Defect Detection. 619-622
- Ka-Hou Chan, Sio Kei Im:
Discrete Tchebichef Transform for Versatile Video Coding. 623-626
- Mohammad Shahid, Kai-Lung Hua:
Fire Detection using Transformer Network. 627-630
Special Session Paper
- Huangpeng Dai, Qing Xie, Jiachen Li, Yanchun Ma, Lin Li, Yongjian Liu:
Visible-infrared Person Re-identification with Human Body Parts Assistance. 631-637
- Zilong Fu, Hongtao Xie, Guoqing Jin, Junbo Guo:
Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition. 638-644
- Jia-Hong Huang, Ting-Wei Wu, Marcel Worring:
Contextualized Keyword Representations for Multi-modal Retinal Image Captioning. 645-652
- Huibing Wang, Guangqi Jiang, Jinjia Peng, Xianping Fu:
MSAV: An Unified Framework for Multi-view Subspace Analysis with View Consistence. 653-659
- Jian Wang, Xian-Hua Han, Lanfen Lin, Hongjie Hu, Yen-Wei Chen:
A Tensor Sparse Representation-Based CBMIR System for Computer-Aided Diagnosis of Focal Liver Lesions and its Pilot Trial. 660-666
- Yingying Xu, Jing Liu, Lanfen Lin, Hongjie Hu, Ruofeng Tong, Jingsong Li, Yen-Wei Chen:
M-DFNet: Multi-phase Discriminative Feature Network for Retrieval of Focal Liver Lesions. 667-673
- Chengyuan Zhang, Zhi Zhong, Lei Zhu, Shichao Zhang, Da Cao, Jianfeng Zhang:
M2GUDA: Multi-Metrics Graph-Based Unsupervised Domain Adaptation for Cross-Modal Hashing. 674-681
- Congcong Zhang, Ning He, Qixiang Sun, Xiaojie Yin, Ke Lu:
Human Pose Estimation based on Attention Multi-resolution Network. 682-687
Workshop Summaries
- Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Cathal Gurrin, Minh-Triet Tran, Nguyen Thanh Binh:
ICDAR'21: Intelligent Cross-Data Analysis and Retrieval. 688-689
- Cathal Gurrin, Björn Þór Jónsson, Klaus Schöffmann, Duc-Tien Dang-Nguyen, Jakub Lokoc, Minh-Triet Tran, Wolfgang Hürst, Luca Rossetto, Graham Healy:
Introduction to the Fourth Annual Lifelog Search Challenge, LSC'21. 690-691
- Min-Chun Hu, Ichiro Ide, Kensuke Tobitani:
MMArt-ACM'21: International Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia 2021. 692-693
- Bei Liu, Jianlong Fu, Shizhe Chen, Qin Jin, Alexander G. Hauptmann, Yong Rui:
MMPT'21: International Joint Workshop on Multi-Modal Pre-Training for Multimedia Understanding. 694-695
- Yoko Yamakata, Atsushi Hashimoto:
CEA'21: The 13th Workshop on Multimedia for Cooking and Eating Activities. 696-697















