


17th ECCV 2022: Tel Aviv, Israel - Volume 35
- Shai Avidan, Gabriel J. Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner:
Computer Vision - ECCV 2022 - 17th European Conference, Tel Aviv, Israel, October 23-27, 2022, Proceedings, Part XXXV. Lecture Notes in Computer Science 13695, Springer 2022, ISBN 978-3-031-19832-8
- Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson:
Efficient One-Stage Video Object Detection by Exploiting Temporal Consistency. 1-16
- Guodong Ding, Angela Yao:
Leveraging Action Affinity and Continuity for Semi-supervised Temporal Action Segmentation. 17-32
- James Hong, Haotian Zhang, Michaël Gharbi, Matthew Fisher, Kayvon Fatahalian:
Spotting Temporally Precise, Fine-Grained Events in Video. 33-51
- Nadine Behrmann, S. Alireza Golestaneh, Zico Kolter, Jürgen Gall, Mehdi Noroozi:
Unified Fully and Timestamp Supervised Temporal Action Segmentation via Sequence to Sequence Translation. 52-68
- Junke Wang, Xitong Yang, Hengduo Li, Li Liu, Zuxuan Wu, Yu-Gang Jiang:
Efficient Video Transformers with Spatial-Temporal Token Selection. 69-86
- Md Mohaiminul Islam, Gedas Bertasius:
Long Movie Clip Classification with State-Space Video Models. 87-104
- Chen Ju, Tengda Han, Kunhao Zheng, Ya Zhang, Weidi Xie:
Prompting Visual-Language Models for Efficient Video Understanding. 105-124
- Huan Li, Ping Wei, Jiapeng Li, Zeyu Ma, Jiahui Shang, Nanning Zheng:
Asymmetric Relation Consistency Reasoning for Video Relation Grounding. 125-141
- Jiacheng Li, Ruize Han, Haomin Yan, Zekun Qian, Wei Feng, Song Wang:
Self-supervised Social Relation Representation for Human Group Detection. 142-159
- Seong Hyeon Park, Jihoon Tack, Byeongho Heo, Jung-Woo Ha, Jinwoo Shin:
K-centered Patch Sampling for Efficient Video Recognition. 160-176
- Guy Erez, Ron Shapira Weber, Oren Freifeld:
A Deep Moving-Camera Background Model. 177-194
- Eitan Kosman, Dotan Di Castro:
GraphVid: It only Takes a Few Nodes to Understand a Video. 195-212
- Amirhossein Habibian, Haitam Ben Yahia, Davide Abati, Efstratios Gavves, Fatih Porikli:
Delta Distillation for Efficient Video Processing. 213-229
- David Junhao Zhang, Kunchang Li, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou:
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning. 230-248
- Honglu Zhou, Asim Kadav, Aviv Shamsian, Shijie Geng, Farley Lai, Long Zhao, Ting Liu, Mubbasir Kapadia, Hans Peter Graf:
COMPOSER: Compositional Reasoning of Group Activity in Videos with Keypoint-Only Modality. 249-266
- Zizhang Li, Mengmeng Wang, Huaijin Pi, Kechun Xu, Jianbiao Mei, Yong Liu:
E-NeRV: Expedite Neural Video Representation with Disentangled Spatial-Temporal Context. 267-284
- Guanxiong Sun, Yang Hua, Guosheng Hu, Neil Robertson:
TDViT: Temporal Dilated Video Transformer for Dense Video Tasks. 285-301
- Woobin Im, Sebin Lee, Sung-Eui Yoon:
Semi-supervised Learning of Optical Flow by Flow Supervisor. 302-318
- Nikita Dvornik, Isma Hadji, Hai X. Pham, Dhaivat Bhatt, Brais Martínez, Afsaneh Fazly, Allan D. Jepson:
Flow Graph to Video Grounding for Weakly-Supervised Multi-step Localization. 319-335
- Yiheng Li, Connelly Barnes, Kun Huang, Fang-Lue Zhang:
Deep 360° Optical Flow Estimation Based on Multi-projection Fusion. 336-352
- Fanyi Xiao, Joseph Tighe, Davide Modolo:
MaCLR: Motion-Aware Contrastive Learning of Representations for Videos. 353-370
- Kyle Min, Sourya Roy, Subarna Tripathi, Tanaya Guha, Somdeb Majumdar:
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection. 371-387
- Ziyi Lin, Shijie Geng, Renrui Zhang, Peng Gao, Gerard de Melo, Xiaogang Wang, Jifeng Dai, Yu Qiao, Hongsheng Li:
Frozen CLIP Models are Efficient Video Learners. 388-404
- Jiafei Duan, Samson Yu, Soujanya Poria, Bihan Wen, Cheston Tan:
PIP: Physical Interaction Prediction via Mental Simulation with Span Selection. 405-421
- Heeseung Yun, Sehun Lee, Gunhee Kim:
Panoramic Vision Transformer for Saliency Detection in 360° Videos. 422-439
- Aditi Basu Bal, Ramy Mounir, Sathyanarayanan N. Aakur, Sudeep Sarkar, Anuj Srivastava:
Bayesian Tracking of Video Graphs Using Joint Kalman Smoothing and Registration. 440-456
- Jingcheng Ni, Nan Zhou, Jie Qin, Qian Wu, Junqi Liu, Boxun Li, Di Huang:
Motion Sensitive Contrastive Learning for Self-supervised Video Representation. 457-474
- Fuchen Long, Zhaofan Qiu, Yingwei Pan, Ting Yao, Chong-Wah Ngo, Tao Mei:
Dynamic Temporal Filtering in Video Models. 475-492
- Renrui Zhang, Wei Zhang, Rongyao Fang, Peng Gao, Kunchang Li, Jifeng Dai, Yu Qiao, Hongsheng Li:
Tip-Adapter: Training-Free Adaption of CLIP for Few-Shot Classification. 493-510
- Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng:
Temporal Lift Pooling for Continuous Sign Language Recognition. 511-527
- Yang Jiao, Shaoxiang Chen, Zequn Jie, Jingjing Chen, Lin Ma, Yu-Gang Jiang:
MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes. 528-545
- Mengxue Qu, Yu Wu, Wu Liu, Qiqi Gong, Xiaodan Liang, Olga Russakovsky, Yao Zhao, Yunchao Wei:
SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding. 546-562
- Jun Wang, Abhir Bhalerao, Yulan He:
Cross-Modal Prototype Driven Network for Radiology Report Generation. 563-579
- Chuan Guo, Xinxin Zuo, Sen Wang, Li Cheng:
TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts. 580-597
- Chaoyang Zhu, Yiyi Zhou, Yunhang Shen, Gen Luo, Xingjia Pan, Mingbao Lin, Chao Chen, Liujuan Cao, Xiaoshuai Sun, Rongrong Ji:
SeqTR: A Simple Yet Universal Network for Visual Grounding. 598-615
- Laura Hanu, James Thewlis, Yuki M. Asano, Christian Rupprecht:
VTC: Improving Video-Text Retrieval with User Comments. 616-633
- Xiao Han, Licheng Yu, Xiatian Zhu, Li Zhang, Yi-Zhe Song, Tao Xiang:
FashionViL: Fashion-Focused Vision-and-Language Representation Learning. 634-651
- Aisha Urooj Khan, Hilde Kuehne, Chuang Gan, Niels da Vitoria Lobo, Mubarak Shah:
Weakly Supervised Grounding for VQA in Vision-Language Transformers. 652-670
- Liliane Momeni, Hannah Bull, K. R. Prajwal, Samuel Albanie, Gül Varol, Andrew Zisserman:
Automatic Dense Annotation of Large-Vocabulary Sign Language Videos. 671-690
- Yuying Ge, Yixiao Ge, Xihui Liu, Jinpeng Wang, Jianping Wu, Ying Shan, Xiaohu Qie, Ping Luo:
MILES: Visual BERT Pre-training with Injected Language Semantics for Video-Text Retrieval. 691-708
- Yuxuan Wang, Difei Gao, Licheng Yu, Weixian Lei, Matt Feiszli, Mike Zheng Shou:
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval. 709-725
- Wei Suo, Mengyang Sun, Kai Niu, Yiqi Gao, Peng Wang, Yanning Zhang, Qi Wu:
A Simple and Robust Correlation Filtering Method for Text-Based Person Search. 726-742
