


ICMR 2022: Newark, NJ, USA
- Vincent Oria, Maria Luisa Sapino, Shin'ichi Satoh, Brigitte Kerhervé, Wen-Huang Cheng, Ichiro Ide, Vivek K. Singh:
  ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27 - 30, 2022. ACM 2022, ISBN 978-1-4503-9238-9
Short Papers
- Zujie Liang, Fan Liang:
  TransPCC: Towards Deep Point Cloud Compression via Transformers. 1-5
- Markus Fox, Klaus Schoeffmann:
  The Impact of Dataset Splits on Classification Performance in Medical Videos. 6-10
- Xiaoyuan Guo, Jiali Duan, Saptarshi Purkayastha, Hari Trivedi, Judy Wawira Gichoya, Imon Banerjee:
  OSCARS: An Outlier-Sensitive Content-Based Radiography Retrieval System. 11-18
- Yuma Honbu, Keiji Yanai:
  Unseen Food Segmentation. 19-23
- Yinghao Wang, Haonan Chen, Jiong Wang, Yingying Zhu:
  DMPCANet: A Low Dimensional Aggregation Network for Visual Place Recognition. 24-28
- Yikang Li, Jenhao Hsiao, Chiuman Ho:
  VideoCLIP: A Cross-Attention Model for Fast Video-Text Retrieval Task with Image CLIP. 29-33
- Mingao Zhang, Changhong Liu, Yong Chen, Zhenchun Lei, Mingwen Wang:
  Music-to-Dance Generation with Multiple Conformer. 34-38
- Wenliang Tang, Zhenzhen Hu, Zijie Song, Richang Hong:
  OCR-oriented Master Object for Text Image Captioning. 39-43
- Yongbiao Chen, Kaicheng Guo, Fangxin Liu, Yusheng Huang, Zhengwei Qi:
  Supervised Contrastive Vehicle Quantization for Efficient Vehicle Retrieval. 44-48
- Rino Naka, Marie Katsurai, Keisuke Yanagi, Ryosuke Goto:
  Fashion Style-Aware Embeddings for Clothing Image Retrieval. 49-53
Session 1A: Reidentification
- Shuyuan Tu, Tianzhen Guan, Li Kuang:
  Multiple Biological Granularities Network for Person Re-Identification. 54-62
- Yajing Zhai, Yawen Zeng, Da Cao, Shaofei Lu:
  TriReID: Towards Multi-Modal Person Re-Identification via Descriptive Fusion Model. 63-71
- Bingliang Jiao, Liying Gao, Peng Wang:
  Temporal-Consistent Visual Clue Attentive Network for Video-Based Person Re-Identification. 72-80
- Lu Yang, Hongbang Liu, Lingqiao Liu, Jinghao Zhou, Lei Zhang, Peng Wang, Yanning Zhang:
  Pluggable Weakly-Supervised Cross-View Learning for Accurate Vehicle Re-Identification. 81-89
Session 1B: Recommendations
- Yanbin Jiang, Huifang Ma, Xiaohui Zhang, Zhixin Li, Liang Chang:
  An Effective Two-way Metapath Encoder over Heterogeneous Information Network for Recommendation. 90-98
- Zhuang Liu, Yunpu Ma, Matthias Schubert, Yuanxin Ouyang, Zhang Xiong:
  Multi-Modal Contrastive Pre-training for Recommendation. 99-108
- Mingda Qian, Xiaoyan Gu, Lingyang Chu, Feifei Dai, Haihui Fan, Bo Li:
  Flexible Order Aware Sequential Recommendation. 109-117
- Jinpeng Chen, Yuan Cao, Fan Zhang, Pengfei Sun, Kaimin Wei:
  Sequential Intention-aware Recommender based on User Interaction Graph. 118-126
Session 2A: Visual+Text Retrieval
- Yongbiao Chen, Sheng Zhang, Fangxin Liu, Zhigang Chang, Mang Ye, Zhengwei Qi:
  TransHash: Transformer-based Hamming Hashing for Efficient Image Retrieval. 127-136
- Zhihao Fan, Zhongyu Wei, Zejun Li, Siyuan Wang, Haijun Shan, Xuanjing Huang, Jianqing Fan:
  Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval. 137-145
- Alex Falcon, Swathikiran Sudhakaran, Giuseppe Serra, Sergio Escalera, Oswald Lanz:
  Relevance-based Margin for Contrastively-trained Video Retrieval Models. 146-157
- Yaoxin Zhuo, Yikang Li, Jenhao Hsiao, Chiuman Ho, Baoxin Li:
  CLIP4Hashing: Unsupervised Deep Hashing for Cross-Modal Video-Text Retrieval. 158-166
Session 2B: Deep Learning - Methodological Advancements
- Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou:
  Nearest Neighbor Search with Compact Codes: A Decoder Perspective. 167-175
- Jan Schutte, Pascal Mettes:
  Teaching a New Dog Old Tricks: Contrastive Random Walks in Videos with Unsupervised Priors. 176-184
- Shaoxiong Zhu, Qi Qi, Zirui Zhuang, Jingyu Wang, Haifeng Sun, Jianxin Liao:
  FedNKD: A Dependable Federated Learning Using Fine-tuned Random Noise and Knowledge Distillation. 185-193
- Anqi Hu, Zhengxing Sun, Qian Li:
  Weakly Supervised Fine-grained Recognition based on Combined Learning for Small Data and Coarse Label. 194-201
Demos
- Yifei Fan, Modan Xie, Peihan Wu, Gang Yang:
  Real-Time Deepfake System for Live Streaming. 202-205
- Alessandro B. Melchiorre, David Penz, Christian Ganhör, Oleg Lesota, Vasco Fragoso, Florian Friztl, Emilia Parada-Cabaleiro, Franz Schubert, Markus Schedl:
  EmoMTB: Emotion-aware Music Tower Blocks. 206-210
- Aaron Duane, Björn Þór Jónsson:
  ViRMA: Virtual Reality Multimedia Analytics. 211-214
- Tingting Dong, Jianquan Liu:
  Person Search by Uncertain Attributes. 215-218
Best Paper Candidates
- Yiqi Gao, Xinglin Hou, Wei Suo, Mengyang Sun, Tiezheng Ge, Yuning Jiang, Peng Wang:
  Dual-Level Decoupled Transformer for Video Captioning. 219-228
- Zhongwei Xie, Lin Li, Luo Zhong, Jianquan Liu, Ling Liu:
  Cross-Modal Retrieval between Event-Dense Text and Image. 229-238
- Sheng Zeng, Changhong Liu, Jun Zhou, Yong Chen, Aiwen Jiang, Hanxi Li:
  Learning Hierarchical Semantic Correspondences for Cross-Modal Image-Text Retrieval. 239-248
Session 3A: Visual+Text Retrieval
- Jianlong Wu, Liangming Pan, Jingjing Chen, Yu-Gang Jiang:
  Ingredient-enriched Recipe Generation from Cooking Videos. 249-257
- Bin Zhu, Chong-Wah Ngo, Jingjing Chen, Wing Kwong Chan:
  Cross-lingual Adaptation for Recipe Retrieval with Mixup. 258-267
- Pei Dong, Lei Wu, Lei Meng, Xiangxu Meng:
  Disentangled Representations and Hierarchical Refinement of Multi-Granularity Features for Text-to-Image Synthesis. 268-276
- Haochen Sun, Lei Wu, Xiang Li, Xiangxu Meng:
  Style-woven Attention Network for Zero-shot Ink Wash Painting Style Transfer. 277-285
Session 3B: Applications
- Georgios Begkas, Panagiotis Giannakeris, Konstantinos Ioannidis, Georgios Kalpakis, Theodora Tsikrika, Stefanos Vrochidis, Ioannis Kompatsiaris:
  Automatic Visual Recognition of Unexploded Ordnances Using Supervised Deep Learning. 286-294
- Yu Yin, Will Hutchcroft, Naji Khosravan, Ivaylo Boyadzhiev, Yun Fu, Sing Bing Kang:
  Generating Topological Structure of Floorplans from Room Attributes. 295-303
- Xuan Wang, Jiajun Chen, Hao Tang, Zhigang Zhu:
  MultiCLU: Multi-stage Context Learning and Utilization for Storefront Accessibility Detection and Evaluation. 304-312
- Yuan Chang, Tao Peng, Ruhan He, Xinrong Hu, Junping Liu, Zili Zhang, Minghua Jiang:
  UF-VTON: Toward User-Friendly Virtual Try-On Network. 313-321
Session 3C: Synchronized MM
- Peijun Bao, Yadong Mu:
  Learning Sample Importance for Cross-Scenario Video Temporal Grounding. 322-329
- Suwichaya Suwanwimolkul, Satoshi Komorita:
  Efficient Linear Attention for Fast and Accurate Keypoint Matching. 330-341
- Ben Xue, Chenchen Liu, Yadong Mu:
  Video2Subtitle: Matching Weakly-Synchronized Sequences via Dynamic Temporal Alignment. 342-350
- Bolin Zhang, Bin Jiang, Chao Yang, Liang Pang:
  Dual-Channel Localization Networks for Moment Retrieval with Natural Language. 351-359
Session 4A: Alignment and Localization
- Sizhe Li, Chang Li, Minghang Zheng, Yang Liu:
  Phrase-level Prediction for Video Temporal Localization. 360-368
- Xingyu Shen, Long Lan, Huibin Tan, Xiang Zhang, Xurui Ma, Zhigang Luo:
  Joint Modality Synergy and Spatio-temporal Cue Purification for Moment Localization. 369-379
- Ru Peng, Yawen Zeng, Junbo Zhao:
  HybridVocab: Towards Multi-Modal Machine Translation via Multi-Aspect Alignment. 380-388
Session 4B: Captioning and Summarization
- Yiqi Gao, Ning Wang, Wei Suo, Mengyang Sun, Peng Wang:
  Improving Image Captioning via Enhancing Dual-Side Context Awareness. 389-397
- Minghao Geng, Qingjie Zhao:
  Improve Image Captioning by Modeling Dynamic Scene Graph Extension. 398-406
- Evlampios Apostolidis, Georgios Balaouras, Vasileios Mezaris, Ioannis Patras:
  Summarizing Videos using Concentrated Attention and Considering the Uniqueness and Diversity of the Video Frames. 407-415
Session 5A: Applications
- Shanchuan Gao, Fankai Zeng, Lu Cheng, Jicong Fan, Mingbo Zhao:
  Fashion Image Search via Anchor-Free Detector. 416-425
- Jingyu Li, Haokai Ma, Xiangxian Li, Zhuang Qi, Lei Meng, Xiangxu Meng:
  Unsupervised Contrastive Masking for Visual Haze Classification. 426-434
- Anwer Slimi, Mounir Zrigui, Henri Nicolas:
  MuLER: Multiplet-Loss for Emotion Recognition. 435-442
- Xingyu Zhu, Yingshuo Liang, Jianlei Zhang, Zengqiang Chen:
  STAFNet: Swin Transformer Based Anchor-Free Network for Detection of Forward-looking Sonar Imagery. 443-450
Session 5B: Robust MM
- Chao Jiang, Yi He, Richard Chapman, Hongyi Wu:
  Camouflaged Poisoning Attack on Graph Neural Networks. 451-461
- Siyuan Li, Guangji Huang, Xing Xu, Yang Yang, Fumin Shen:
  Accelerated Sign Hunter: A Sign-based Black-box Attack via Branch-Prune Strategy and Stabilized Hierarchical Search. 462-470
- Zhen Luo, Yingfang Zhang, Peihao Zhong, Jingjing Chen, Donglong Chen:
  DiGAN: Directional Generative Adversarial Network for Object Transfiguration. 471-479
- Xiaoheng Sun, Xia Liang, Qiqi He, Bilei Zhu, Zejun Ma:
  GIO: A Timbre-informed Approach for Pitch Tracking in Highly Noisy Environments. 480-488
Session 5C: Action, Pose and Body
- Peipeng Chen, Andy J. Ma:
  Source-free Temporal Attentive Domain Adaptation for Video Action Recognition. 489-497
- Neng Zhou, Hairu Wen, Yi Wang, Yang Liu, Longfei Zhou:
  Review of Deep Learning Models for Spine Segmentation. 498-507
- Zhidan Liu, Zhen Xing, Xiangdong Zhou, Yijiang Chen, Guichun Zhou:
  3D-Augmented Contrastive Knowledge Distillation for Image-based Object Pose Estimation. 508-517
- Yiran Zhu, Guangji Huang, Xing Xu, Yanli Ji, Fumin Shen:
  Selective Hypergraph Convolutional Networks for Skeleton-based Action Recognition. 518-526
Session 6: Multifarious Multimedia
- Guangyu Chen, Deyuan Zhang, Tao Liu, Xiaoyong Du:
  Self-Lifting: A Novel Framework for Unsupervised Voice-Face Association Learning. 527-535
- Hongya Wang, Shunxin Dai, Ming Du, Bo Xu, Mingyong Li:
  Revisiting Performance Measures for Cross-Modal Hashing. 536-544
- Yifeng Zhuang, Qiang Sun, Yanwei Fu, Lifeng Chen, Xiangyang Xue:
  Local Slot Attention for Vision and Language Navigation. 545-553
- Yuhui Guo, Xun Liang, Tang Hui, Bo Wu, Xiangping Zheng:
  Cross-Pixel Dependency with Boundary-Feature Transformation for Weakly Supervised Semantic Segmentation. 554-561
- Kangning Yang, Benjamin Tag, Yue Gu, Chaofan Wang, Tilman Dingler, Greg Wadley, Jorge Gonçalves:
  Mobile Emotion Recognition via Multiple Physiological Signals using Convolution-augmented Transformer. 562-570
Special Session 1: Adversarial Learning for Multimedia Understanding and Retrieval
- Weidong Shi, Yunzhou Zhang, Shangdong Zhu, Yixiu Liu, Sonya Coleman, Dermot Kerr:
  VAC-Net: Visual Attention Consistency Network for Person Re-identification. 571-578
- Lijia Deng, Yu-Dong Zhang:
  MFGAN: A Lightweight Fast Multi-task Multi-scale Feature-fusion Model based on GAN. 579-586
- Zhipeng Wei, Jingjing Chen, Hao Zhang, Linxi Jiang, Yu-Gang Jiang:
  Adaptive Temporal Grouping for Black-box Adversarial Attacks on Videos. 587-593
Special Session 2A: Transformer-based Multimedia Understanding: Model Design, Learning, Distillation
- Guangqi Jiang, Huibing Wang, Jinjia Peng, Xianping Fu:
  Parallelism Network with Partial-aware and Cross-correlated Transformer for Vehicle Re-identification. 594-600
- Siqi Sun, Yongqing Sun, Mitsuhiro Goto, Shigekuni Kondo, Dan Mikami, Susumu Yamamoto:
  Motor Learning based on Presentation of a Tentative Goal. 601-607
- Kui Xiao, Youheng Bai, Yan Zhang:
  Extracting Precedence Relations between Video Lectures in MOOCs. 608-614
- Junke Wang, Zuxuan Wu, Wenhao Ouyang, Xintong Han, Jingjing Chen, Yu-Gang Jiang, Ser-Nam Lim:
  M2TR: Multi-modal Multi-scale Transformers for Deepfake Detection. 615-623
Special Session 2B: Transformer-based Multimedia Understanding: Model Design, Learning, Distillation
- Bo Fu, Yuanxin Mao, Shilin Fu, Yonggong Ren, Zhongxuan Luo:
  Blindfold Attention: Novel Mask Strategy for Facial Expression Recognition. 624-630
- Lei Zhu, Liewu Cai, Jiayu Song, Xinghui Zhu, Chengyuan Zhang, Shichao Zhang:
  MSSPQ: Multiple Semantic Structure-Preserving Quantization for Cross-Modal Retrieval. 631-638
Special Session 3A: Weakly Supervised Learning for Medical Image Analysis
- Yue Wu, Yang Zhou, Jianchun Zhao, Jingyuan Yang, Weihong Yu, Youxin Chen, Xirong Li:
  Lesion Localization in OCT by Semi-Supervised Object Detection. 639-646
- Yunyan Yan, Chuanbin Liu, Hongtao Xie, Sicheng Zhang, Zhendong Mao:
  Weakly Supervised Pediatric Bone Age Assessment Using Ultrasonic Images via Automatic Anatomical RoI Detection. 647-653
- Chao Suo, Xuanya Li, Donghui Tan, Yuan Zhang, Xieping Gao:
  I2-Net: Intra- and Inter-scale Collaborative Learning Network for Abdominal Multi-organ Segmentation. 654-660
- Fenxia Duan, Chunhong Cao, Xieping Gao:
  SA-NAS-BFNR: Spatiotemporal Attention Neural Architecture Search for Task-based Brain Functional Network Representation. 661-667
Special Session 3B: Weakly Supervised Learning for Medical Image Analysis
- Qian Wu, Yufei Chen, Ning Huang, Xiaodong Yue:
  Weakly-supervised Cerebrovascular Segmentation Network with Shape Prior and Model Indicator. 668-676
Doctoral Symposium
- Runsheng Zhang:
  FreqCAM: Frequent Class Activation Map for Weakly Supervised Object Localization. 677-680
Reproducibility Paper
- Yunqing He, Xu Sun, Hui Jiang, Tongwei Ren, Gangshan Wu, Maria Sinziana Astefanoaei, Andreas Leibetseder:
  Reproducibility Companion Paper: Human Object Interaction Detection via Multi-level Conditioned Network. 681-684
Workshop Summaries
- Cathal Gurrin, Liting Zhou, Graham Healy, Björn Þór Jónsson, Duc-Tien Dang-Nguyen, Jakub Lokoc, Minh-Triet Tran, Wolfgang Hürst, Luca Rossetto, Klaus Schöffmann:
  Introduction to the Fifth Annual Lifelog Search Challenge, LSC'22. 685-687
- Bogdan Ionescu, Giorgos Kordopatis-Zilos, Adrian Popescu, Luca Cuccovillo, Symeon Papadopoulos:
  MAD '22 Workshop: Multimedia AI against Disinformation. 688-689
- Minh-Son Dao, Michael Alexander Riegler, Duc-Tien Dang-Nguyen, Cathal Gurrin, Yuta Nakashima, Mianxiong Dong:
  ICDAR'22: Intelligent Cross-Data Analysis and Retrieval. 690-691
- Naoko Nitta, Anita Min-Chun Hu, Kensuke Tobitani:
  MMArt-ACM 2022: 5th Joint Workshop on Multimedia Artworks Analysis and Attractiveness Computing in Multimedia. 692-693















