Zheng Shou 0001

Mike Zheng Shou

Person information

affiliation: National University of Singapore
affiliation (former): Columbia University, New York, NY, USA

2020 – today

2026
[j17]
persistent URL: https://dblp.org/rec/journals/pr/XieLLHLLZZSS26
Jinheng Xie, Zhaochuan Luo, Rouyi Li, Yawen Huang, Haozhe Liu, Yuexiang Li, Yefeng Zheng, Yang Zhang, Linlin Shen, Mike Zheng Shou:
Open-world Weakly-Supervised Object Localization. Pattern Recognit. 169: 111808 (2026)
[i209]
persistent URL: https://dblp.org/rec/journals/corr/abs-2601-03928
Mingyu Ouyang, Kevin Qinghong Lin, Mike Zheng Shou, Hwee Tou Ng:
FocusUI: Efficient UI Grounding via Position-Preserving Visual Token Selection. CoRR abs/2601.03928 (2026)
[i208]
persistent URL: https://dblp.org/rec/journals/corr/abs-2601-07181
Yichun Zhang, Xiangwu Guo, Yauhong Goh, Jessica Hu, Zhiheng Chen, Xin Wang, Difei Gao, Mike Zheng Shou:
ShowUI-Aloha: Human-Taught GUI Agent. CoRR abs/2601.07181 (2026)
2025
[j16]
persistent URL: https://dblp.org/rec/journals/ijcv/ZhangWLZRGGS25
David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou:
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation. Int. J. Comput. Vis. 133(4): 1879-1893 (2025)
[j15]
persistent URL: https://dblp.org/rec/journals/ijcv/ZhangLLSXS25
David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo:
MoonShot: Towards Controllable Video Generation and Editing with Motion-Aware Multimodal Conditions. Int. J. Comput. Vis. 133(6): 3629-3644 (2025)
[j14]
persistent URL: https://dblp.org/rec/journals/ijcv/WuLHSSCLGZ25
Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang:
Paragraph-to-Image Generation with Information-Enriched Diffusion Model. Int. J. Comput. Vis. 133(8): 5413-5434 (2025)
[j13]
persistent URL: https://dblp.org/rec/journals/ijcv/XieDHLSHZS25
Jinheng Xie, Songhe Deng, Xianxu Hou, Zhaochuan Luo, Linlin Shen, Yawen Huang, Yefeng Zheng, Mike Zheng Shou:
CLIMS++: Cross Language Image Matching with Automatic Context Discovery for Weakly Supervised Semantic Segmentation. Int. J. Comput. Vis. 133(8): 5569-5588 (2025)
[j12]
persistent URL: https://dblp.org/rec/journals/ijon/ShiLLLYS25
Yufei Shi, Beijia Lu, Jia-Wei Liu, Ming Li, Si Yong Yeo, Mike Zheng Shou:
ColonNeRF: High-fidelity neural reconstruction of long colonoscopy. Neurocomputing 657: 131445 (2025)
[j11]
persistent URL: https://dblp.org/rec/journals/pami/GraumanWBCCFGHJKLLMNRRR25
Kristen Grauman, Andrew Westbury, Eugene Byrne, Vincent Cartillier, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Devansh Kukreja, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran K. Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,600 Hours of Egocentric Video. IEEE Trans. Pattern Anal. Mach. Intell. 47(11): 9468-9509 (2025)
[j10]
persistent URL: https://dblp.org/rec/journals/pr/WuZLLZSB25
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai:
A large cross-modal video retrieval dataset with reading comprehension. Pattern Recognit. 157: 110818 (2025)
[j9]
persistent URL: https://dblp.org/rec/journals/tcsv/WuLCZS25
Weijia Wu, Zhuang Li, Yuanqiang Cai, Hong Zhou, Mike Zheng Shou:
A Bilingual, Open World Video Text Dataset and Real-Time Video Text Spotting With Contrastive Learning. IEEE Trans. Circuits Syst. Video Technol. 35(1): 534-546 (2025)
[j8]
persistent URL: https://dblp.org/rec/journals/tmlr/LiuZXFX0SPS25
Haozhe Liu, Wentian Zhang, Jinheng Xie, Francesco Faccio, Mengmeng Xu, Tao Xiang, Mike Zheng Shou, Juan-Manuel Pérez-Rúa, Jürgen Schmidhuber:
Faster Diffusion Through Temporal Attention Decomposition. Trans. Mach. Learn. Res. 2025 (2025)
[c118]
persistent URL: https://dblp.org/rec/conf/aaai/IlaslanKLSSX25
Muhammet Furkan Ilaslan, Ali Köksal, Kevin Qinghong Lin, Burak Satar, Mike Zheng Shou, Qianli Xu:
VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting. AAAI 2025: 3886-3894
[c117]
persistent URL: https://dblp.org/rec/conf/acl/ZhangDWHJFSZ025
Xinyu Zhang, Yuxuan Dong, Yanrui Wu, Jiaxing Huang, Chengyou Jia, Basura Fernando, Mike Zheng Shou, Lingling Zhang, Jun Liu:
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning. ACL (1) 2025: 16593-16615
[c116]
persistent URL: https://dblp.org/rec/conf/cvpr/ZhangPZKJPMSWR25
David Junhao Zhang, Roni Paiss, Shiran Zada, Nikhil Karnad, David E. Jacobs, Yael Pritch, Inbar Mosseri, Mike Zheng Shou, Neal Wadhwa, Nataniel Ruiz:
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning. CVPR 2025: 2050-2062
[c115]
persistent URL: https://dblp.org/rec/conf/cvpr/0001MS25
Rui Zhao, Weijia Mao, Mike Zheng Shou:
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles. CVPR 2025: 2835-2846
[c114]
persistent URL: https://dblp.org/rec/conf/cvpr/Song0CS25
Yiren Song, Pei Yang, Hai Ci, Mike Zheng Shou:
IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation. CVPR 2025: 3019-3028
[c113]
persistent URL: https://dblp.org/rec/conf/cvpr/LinS25
Kevin Qinghong Lin, Mike Zheng Shou:
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary. CVPR 2025: 3218-3228
[c112]
persistent URL: https://dblp.org/rec/conf/cvpr/MeiZS25
Haiyang Mei, Pengyu Zhang, Mike Zheng Shou:
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost. CVPR 2025: 3417-3426
[c111]
persistent URL: https://dblp.org/rec/conf/cvpr/LinLGYWBLWS25
Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Stan Weixian Lei, Lijuan Wang, Mike Zheng Shou:
ShowUI: One Vision-Language-Action Model for GUI Visual Agent. CVPR 2025: 19498-19508
[c110]
persistent URL: https://dblp.org/rec/conf/cvpr/GuZYNY0LS25
Yuchao Gu, Yipin Zhou, Yunfan Ye, Yixin Nie, Licheng Yu, Pingchuan Ma, Kevin Qinghong Lin, Mike Zheng Shou:
ROICtrl: Boosting Instance Control for Visual Generation. CVPR 2025: 23658-23667
[c109]
persistent URL: https://dblp.org/rec/conf/cvpr/Wu0TR0SFGL25
Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling:
DIFIX3D+: Improving 3D Reconstructions with Single-Step Diffusion Models. CVPR 2025: 26024-26035
[c108]
persistent URL: https://dblp.org/rec/conf/cvpr/0001LZXF0LSS25
Weijia Wu, Mingyu Liu, Zeyu Zhu, Xi Xia, Haoen Feng, Wen Wang, Kevin Qinghong Lin, Chunhua Shen, Mike Zheng Shou:
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation. CVPR 2025: 28984-28994
[c107]
persistent URL: https://dblp.org/rec/conf/cvpr/ChenZLLMS25
Joya Chen, Ziyun Zeng, Yiqi Lin, Wei Li, Zejun Ma, Mike Zheng Shou:
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale. CVPR 2025: 29083-29095
[c106]
persistent URL: https://dblp.org/rec/conf/emnlp/ZhaoPTMS25
Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou:
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models with Human Feedback. EMNLP (Findings) 2025: 25381-25400
[c105]
persistent URL: https://dblp.org/rec/conf/iclr/BaiX0W0BS25
Zechen Bai, Tianjun Xiao, Tong He, Pichao Wang, Zheng Zhang, Thomas Brox, Mike Zheng Shou:
Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach. ICLR 2025
[c104]
persistent URL: https://dblp.org/rec/conf/iclr/JiaoZLZGSS25
Siyi Jiao, Wenzheng Zeng, Yerong Li, Huayu Zhang, Changxin Gao, Nong Sang, Mike Zheng Shou:
MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation. ICLR 2025
[c103]
persistent URL: https://dblp.org/rec/conf/iclr/LeiGS25
Weixian Lei, Difei Gao, Mike Zheng Shou:
Grounding Multimodal Large Language Model in GUI World. ICLR 2025
[c102]
persistent URL: https://dblp.org/rec/conf/iclr/LiuSCZWSB25
Yepeng Liu, Yiren Song, Hai Ci, Yu Zhang, Haofan Wang, Mike Zheng Shou, Yuheng Bu:
Image Watermarks are Removable using Controllable Regeneration from Clean Noise. ICLR 2025
[c101]
persistent URL: https://dblp.org/rec/conf/iclr/XieMBZWLGCYS25
Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou:
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation. ICLR 2025
[c100]
persistent URL: https://dblp.org/rec/conf/icml/BaiCS25
Zechen Bai, Hai Ci, Mike Zheng Shou:
Impossible Videos. ICML 2025
[c99]
persistent URL: https://dblp.org/rec/conf/icml/CiS0XS25
Hai Ci, Yiren Song, Pei Yang, Jinheng Xie, Mike Zheng Shou:
WMAdapter: Adding WaterMark Control to Latent Diffusion Models. ICML 2025
[c98]
persistent URL: https://dblp.org/rec/conf/mm/WuGLWS25
Qinchen Wu, Difei Gao, Qinghong Lin, Zhuoyu Wu, Mike Zheng Shou:
GUI-Narrator: Detecting and Captioning Computer GUI Actions. ACM Multimedia 2025: 3683-3692
[c97]
persistent URL: https://dblp.org/rec/conf/mm/MeiGWYS25
Haiyang Mei, Difei Gao, Xiaopeng Wei, Xin Yang, Mike Zheng Shou:
Can I Trust You? Advancing GUI Task Automation with Action Trust Score. ACM Multimedia 2025: 8692-8700
[i207]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-01105
Yiren Song, Danze Chen, Mike Zheng Shou:
LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer. CoRR abs/2502.01105 (2025)
[i206]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-01572
Yiren Song, Cheng Liu, Mike Zheng Shou:
MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation. CoRR abs/2502.01572 (2025)
[i205]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-06474
Weijia Mao, Zhenheng Yang, Mike Zheng Shou:
UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths. CoRR abs/2502.06474 (2025)
[i204]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-08047
Henry Hengyuan Zhao, Difei Gao, Mike Zheng Shou:
WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation. CoRR abs/2502.08047 (2025)
[i203]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-12054
Xinyu Zhang, Yuxuan Dong, Yanrui Wu, Jiaxing Huang, Chengyou Jia, Basura Fernando, Mike Zheng Shou, Lingling Zhang, Jun Liu:
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning. CoRR abs/2502.12054 (2025)
[i202]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-14397
Shijie Huang, Yiren Song, Yuxuan Zhang, Hailong Guo, Xueyin Wang, Mike Zheng Shou, Jiaming Liu:
PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data. CoRR abs/2502.14397 (2025)
[i201]
persistent URL: https://dblp.org/rec/journals/corr/abs-2502-15027
Henry Hengyuan Zhao, Wenqi Pei, Yifei Tao, Haiyang Mei, Mike Zheng Shou:
InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback. CoRR abs/2502.15027 (2025)
[i200]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-01774
Jay Zhangjie Wu, Yuxuan Zhang, Haithem Turki, Xuanchi Ren, Jun Gao, Mike Zheng Shou, Sanja Fidler, Zan Gojcic, Huan Ling:
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models. CoRR abs/2503.01774 (2025)
[i199]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-03651
Rui Zhao, Weijia Mao, Mike Zheng Shou:
DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles. CoRR abs/2503.03651 (2025)
[i198]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-07314
Weijia Wu, Zeyu Zhu, Mike Zheng Shou:
Automated Movie Generation via Multi-Agent CoT Planning. CoRR abs/2503.07314 (2025)
[i197]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-07601
Yuxin Jiang, Liming Jiang, Shuai Yang, Jia-Wei Liu, Ivor W. Tsang, Mike Zheng Shou:
Balanced Image Stylization with Style Matching Score. CoRR abs/2503.07601 (2025)
[i196]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-09241
Pei Yang, Hai Ci, Mike Zheng Shou:
In-Context Defense in Computer Agents: An Empirical Study. CoRR abs/2503.09241 (2025)
[i195]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-09402
Kevin Qinghong Lin, Mike Zheng Shou:
VLog: Video-Language Models by Generative Retrieval of Narration Vocabulary. CoRR abs/2503.09402 (2025)
[i194]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-09566
Lingmin Ran, Mike Zheng Shou:
TPDiff: Temporal Pyramid Video Diffusion Model. CoRR abs/2503.09566 (2025)
[i193]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-13327
Lan Chen, Qi Mao, Yuchao Gu, Mike Zheng Shou:
Edit Transfer: Learning Image Editing via Vision In-Context Relations. CoRR abs/2503.13327 (2025)
[i192]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-13444
Ye Liu, Kevin Qinghong Lin, Chang Wen Chen, Mike Zheng Shou:
VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning. CoRR abs/2503.13444 (2025)
[i191]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-14378
Zechen Bai, Hai Ci, Mike Zheng Shou:
Impossible Videos. CoRR abs/2503.14378 (2025)
[i190]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-19325
Yuchao Gu, Weijia Mao, Mike Zheng Shou:
Long-Context Autoregressive Video Modeling with Next-Frame Prediction. CoRR abs/2503.19325 (2025)
[i189]
persistent URL: https://dblp.org/rec/journals/corr/abs-2503-21904
Zhiwei Yang, Chen Gao, Jing Liu, Peng Wu, Guansong Pang, Mike Zheng Shou:
AssistPDA: An Online Video Surveillance Assistant for Video Anomaly Prediction, Detection, and Analysis. CoRR abs/2503.21904 (2025)
[i188]
persistent URL: https://dblp.org/rec/journals/corr/abs-2504-05594
Qi Mao, Lan Chen, Yuchao Gu, Mike Zheng Shou, Ming-Hsuan Yang:
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model. CoRR abs/2504.05594 (2025)
[i187]
persistent URL: https://dblp.org/rec/journals/corr/abs-2504-14606
Siyi Jiao, Wenzheng Zeng, Yerong Li, Huayu Zhang, Changxin Gao, Nong Sang, Mike Zheng Shou:
MP-Mat: A 3D-and-Instance-Aware Human Matting and Editing Framework with Multiplane Representation. CoRR abs/2504.14606 (2025)
[i186]
persistent URL: https://dblp.org/rec/journals/corr/abs-2504-16030
Joya Chen, Ziyun Zeng, Yiqi Lin, Wei Li, Zejun Ma, Mike Zheng Shou:
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale. CoRR abs/2504.16030 (2025)
[i185]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-13300
Zekai Li, Xinhao Zhong, Samir Khaki, Zhiyuan Liang, Yuhao Zhou, Mingjia Shi, Ziqiao Wang, Xuanlei Zhao, Wangbo Zhao, Ziheng Qin, Mengxuan Wu, Pengfei Zhou, Haonan Wang, David Junhao Zhang, Jia-Wei Liu, Shaobo Wang, Dai Liu, Linfeng Zhang, Guang Li, Kun Wang, Zheng Zhu, Zhiheng Ma, Joey Tianyi Zhou, Jiancheng Lv, Yaochu Jin, Peihao Wang, Kaipeng Zhang, Lingjuan Lyu, Yiran Huang, Zeynep Akata, Zhiwei Deng, Xindi Wu, George Cazenavette, Yuzhang Shang, Justin Cui, Jindong Gu, Qian Zheng, Hao Ye, Shuo Wang, Xiaobo Wang, Yan Yan, Angela Yao, Mike Zheng Shou, Tianlong Chen, Hakan Bilen, Baharan Mirzasoleiman, Manolis Kellis, Konstantinos N. Plataniotis, Zhangyang Wang, Bo Zhao, Yang You, Kai Wang:
DD-Ranking: Rethinking the Evaluation of Dataset Distillation. CoRR abs/2505.13300 (2025)
[i184]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-16854
Jiaqi Wang, Kevin Qinghong Lin, James Cheng, Mike Zheng Shou:
Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models. CoRR abs/2505.16854 (2025)
[i183]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-18445
Yiren Song, Cheng Liu, Mike Zheng Shou:
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data. CoRR abs/2505.18445 (2025)
[i182]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-23380
Weijia Mao, Zhenheng Yang, Mike Zheng Shou:
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning. CoRR abs/2505.23380 (2025)
[i181]
persistent URL: https://dblp.org/rec/journals/corr/abs-2505-23660
Ziteng Gao, Mike Zheng Shou:
D-AR: Diffusion via Autoregressive Models. CoRR abs/2505.23660 (2025)
[i180]
persistent URL: https://dblp.org/rec/journals/corr/abs-2506-01304
Haiyang Mei, Pengyu Zhang, Mike Zheng Shou:
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost. CoRR abs/2506.01304 (2025)
[i179]
persistent URL: https://dblp.org/rec/journals/corr/abs-2506-04135
Pei Yang, Hai Ci, Mike Zheng Shou:
macOSWorld: A Multilingual Interactive Benchmark for GUI Agents. CoRR abs/2506.04135 (2025)
[i178]
persistent URL: https://dblp.org/rec/journals/corr/abs-2506-15564
Jinheng Xie, Zhenheng Yang, Mike Zheng Shou:
Show-o2: Improved Native Unified Multimodal Models. CoRR abs/2506.15564 (2025)
[i177]
persistent URL: https://dblp.org/rec/journals/corr/abs-2506-17301
Guian Fang, Yuchao Gu, Mike Zheng Shou:
FramePrompt: In-context Controllable Animation with Zero Structural Changes. CoRR abs/2506.17301 (2025)
[i176]
persistent URL: https://dblp.org/rec/journals/corr/abs-2507-17294
Jianxin Bi, Kevin Yuchen Ma, Ce Hao, Mike Zheng Shou, Harold Soh:
VLA-Touch: Enhancing Vision-Language-Action Models with Dual-Level Tactile Feedback. CoRR abs/2507.17294 (2025)
[i175]
persistent URL: https://dblp.org/rec/journals/corr/abs-2508-03050
Zeyu Zhu, Weijia Wu, Mike Zheng Shou:
Multi-human Interactive Talking Dataset. CoRR abs/2508.03050 (2025)
[i174]
persistent URL: https://dblp.org/rec/journals/corr/abs-2508-08189
Weijia Wu, Chen Gao, Joya Chen, Kevin Qinghong Lin, Qingwei Meng, Yiming Zhang, Yuke Qiu, Hong Zhou, Mike Zheng Shou:
Reinforcement Learning in Vision: A Survey. CoRR abs/2508.08189 (2025)
[i173]
persistent URL: https://dblp.org/rec/journals/corr/abs-2508-19852
Binjie Zhang, Mike Zheng Shou:
Ego-centric Predictive Model Conditioned on Hand Trajectories. CoRR abs/2508.19852 (2025)
[i172]
persistent URL: https://dblp.org/rec/journals/corr/abs-2508-21727
Jiazheng Xing, Hai Ci, Hongbin Xu, Hangjie Yuan, Yong Liu, Mike Zheng Shou:
OptMark: Robust Multi-bit Diffusion Watermarking via Inference Time Optimization. CoRR abs/2508.21727 (2025)
[i171]
persistent URL: https://dblp.org/rec/journals/corr/abs-2509-01986
Ziyun Zeng, Junhao Zhang, Wei Li, Mike Zheng Shou:
Draw-In-Mind: Rebalancing Designer-Painter Roles in Unified Multimodal Models Benefits Image Editing. CoRR abs/2509.01986 (2025)
[i170]
persistent URL: https://dblp.org/rec/journals/corr/abs-2509-22010
Xinyu Zhang, Yuxuan Dong, Lingling Zhang, Chengyou Jia, Zhuohang Dang, Basura Fernando, Jun Liu, Mike Zheng Shou:
CoFFT: Chain of Foresight-Focus Thought for Visual Language Models. CoRR abs/2509.22010 (2025)
[i169]
  dblp key:
  - journals/corr/abs-2509-25172
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2509-25172
Yuxin Jiang, Yuchao Gu, Yiren Song, Ivor W. Tsang, Mike Zheng Shou:
Personalized Vision via Visual In-Context Learning. CoRR abs/2509.25172 (2025)
[i168]
  dblp key:
  - journals/corr/abs-2509-26386
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2509-26386
Zhiwei Yang, Chen Gao, Mike Zheng Shou:
PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer. CoRR abs/2509.26386 (2025)
[i167]
  dblp key:
  - journals/corr/abs-2510-01174
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2510-01174
Yanzhe Chen, Kevin Qinghong Lin, Mike Zheng Shou:
Code2Video: A Code-centric Paradigm for Educational Video Generation. CoRR abs/2510.01174 (2025)
[i166]
  dblp key:
  - journals/corr/abs-2510-05096
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2510-05096
Zeyu Zhu, Kevin Qinghong Lin, Mike Zheng Shou:
Paper2Video: Automatic Video Generation from Scientific Papers. CoRR abs/2510.05096 (2025)
[i165]
  dblp key:
  - journals/corr/abs-2510-06068
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2510-06068
Heng Zhang, Kevin Yuchen Ma, Mike Zheng Shou, Weisi Lin, Yan Wu:
Cross-Embodiment Dexterous Hand Articulation Generation via Morphology-Aware Learning. CoRR abs/2510.06068 (2025)
[i164]
  dblp key:
  - journals/corr/abs-2510-18703
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2510-18703
Yiqi Lin, Alex Jinpeng Wang, Linjie Li, Zhengyuan Yang, Mike Zheng Shou:
Exploring a Unified Vision-Centric Contrastive Alternatives on Multi-Modal Web Documents. CoRR abs/2510.18703 (2025)
[i163]
  dblp key:
  - journals/corr/abs-2511-06417
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-06417
Xiangwu Guo, Difei Gao, Mike Zheng Shou:
AUTO-Explorer: Automated Data Collection for GUI Agent. CoRR abs/2511.06417 (2025)
[i162]
  dblp key:
  - journals/corr/abs-2511-15567
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-15567
Kevin Qinghong Lin, Siyuan Hu, Linjie Li, Zhengyuan Yang, Lijuan Wang, Philip Torr, Mike Zheng Shou:
Computer-Use Agents as Judges for Generative User Interface. CoRR abs/2511.15567 (2025)
[i161]
  dblp key:
  - journals/corr/abs-2511-18673
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-18673
Yiqing Shi, Yiren Song, Mike Zheng Shou:
Edit2Perceive: Image Editing Diffusion Models Are Strong Dense Perceivers. CoRR abs/2511.18673 (2025)
[i160]
  dblp key:
  - journals/corr/abs-2511-19111
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-19111
Hai Ci, Ziheng Peng, Pei Yang, Yingxin Xuan, Mike Zheng Shou:
DiffSeg30k: A Multi-Turn Diffusion Editing Benchmark for Localized AIGC Detection. CoRR abs/2511.19111 (2025)
[i159]
  dblp key:
  - journals/corr/abs-2511-20256
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-20256
Weijia Mao, Hao Chen, Zhenheng Yang, Mike Zheng Shou:
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation. CoRR abs/2511.20256 (2025)
[i158]
  dblp key:
  - journals/corr/abs-2511-20614
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-20614
Ziheng Ouyang, Yiren Song, Yaoli Liu, Shihao Zhu, Qibin Hou, Ming-Ming Cheng, Mike Zheng Shou:
The Consistency Critic: Correcting Inconsistencies in Generated Images via Reference-Guided Attentive Alignment. CoRR abs/2511.20614 (2025)
[i157]
  dblp key:
  - journals/corr/abs-2511-22098
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-22098
Quanjian Song, Yiren Song, Kelly Peng, Yuan Gao, Mike Zheng Shou:
WorldWander: Bridging Egocentric and Exocentric Worlds in Video Generation. CoRR abs/2511.22098 (2025)
[i156]
  dblp key:
  - journals/corr/abs-2511-22950
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2511-22950
Haiyang Mei, Qiming Huang, Hai Ci, Mike Zheng Shou:
RobotSeg: A Model and Dataset for Segmenting Robots in Image and Video. CoRR abs/2511.22950 (2025)
[i155]
  dblp key:
  - journals/corr/abs-2512-04537
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-04537
Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou:
X-Humanoid: Robotize Human Videos to Generate Humanoid Videos at Scale. CoRR abs/2512.04537 (2025)
[i154]
  dblp key:
  - journals/corr/abs-2512-09247
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-09247
Cheng Liu, Yiren Song, Haofan Wang, Mike Zheng Shou:
OmniPSD: Layered PSD Generation with Diffusion Transformer. CoRR abs/2512.09247 (2025)
[i153]
  dblp key:
  - journals/corr/abs-2512-09406
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-09406
Hai Ci, Xiaokang Liu, Pei Yang, Yiren Song, Mike Zheng Shou:
H2R-Grounder: A Paired-Data-Free Paradigm for Translating Human Interaction Videos into Physically Grounded Robot Videos. CoRR abs/2512.09406 (2025)
[i152]
  dblp key:
  - journals/corr/abs-2512-14666
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-14666
Zechen Bai, Chen Gao, Mike Zheng Shou:
EVOLVE-VLA: Test-Time Training from Environment Feedback for Vision-Language-Action Models. CoRR abs/2512.14666 (2025)
[i151]
  dblp key:
  - journals/corr/abs-2512-17253
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-17253
Yiren Song, Cheng Liu, Weijia Mao, Mike Zheng Shou:
Mitty: Diffusion-based Human-to-Robot Video Generation. CoRR abs/2512.17253 (2025)
[i150]
  dblp key:
  - journals/corr/abs-2512-24097
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-24097
Wenzheng Zeng, Difei Gao, Mike Zheng Shou, Hwee Tou Ng:
Factorized Learning for Temporally Grounded Video-Language Models. CoRR abs/2512.24097 (2025)
[i149]
  dblp key:
  - journals/corr/abs-2512-24965
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2512-24965
Siyuan Hu, Kevin Qinghong Lin, Mike Zheng Shou:
ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands. CoRR abs/2512.24965 (2025)
2024
[j7]
  dblp key:
  - journals/ijcv/ZhaoWZ00S24
  persistent URL:
  - https://dblp.org/rec/journals/ijcv/ZhaoWZ00S24
Henry Hengyuan Zhao, Pichao Wang, Yuyang Zhao, Hao Luo, Fan Wang, Mike Zheng Shou:
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels. Int. J. Comput. Vis. 132(3): 731-749 (2024)
[j6]
  dblp key:
  - journals/pami/WangZSY24
  persistent URL:
  - https://dblp.org/rec/journals/pami/WangZSY24
Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan:
Enhancing Visual Grounding in Vision-Language Pre-Training With Position-Guided Text Prompts. IEEE Trans. Pattern Anal. Mach. Intell. 46(5): 3406-3421 (2024)
[j5]
  dblp key:
  - journals/tcsv/WuZLSZS24
  persistent URL:
  - https://dblp.org/rec/journals/tcsv/WuZLSZS24
Weijia Wu, Yuzhong Zhao, Zhuang Li, Lianlei Shan, Hong Zhou, Mike Zheng Shou:
Continual Learning for Image Segmentation With Dynamic Query. IEEE Trans. Circuits Syst. Video Technol. 34(6): 4874-4886 (2024)
[j4]
  dblp key:
  - journals/tkde/ZhangCOSTTXYZ24
  persistent URL:
  - https://dblp.org/rec/journals/tkde/ZhangCOSTTXYZ24
Bingxue Zhang, Gang Chen, Beng Chin Ooi, Mike Zheng Shou, Kian-Lee Tan, Anthony K. H. Tung, Xiaokui Xiao, James Wei Luen Yip, Meihui Zhang:
Managing Metaverse Data Tsunami: Actionable Insights. IEEE Trans. Knowl. Data Eng. 36(12): 7423-7441 (2024)
[j3]
  dblp key:
  - journals/tmm/LiFHFLKS24
  persistent URL:
  - https://dblp.org/rec/journals/tmm/LiFHFLKS24
Ming Li, Huazhu Fu, Shengfeng He, Hehe Fan, Jun Liu, Jussi Keppo, Mike Zheng Shou:
DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition. IEEE Trans. Multim. 26: 6297-6309 (2024)
[c96]
  dblp key:
  - conf/cvpr/XuZLYLZFS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/XuZLYLZFS24
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou:
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. CVPR 2024: 1481-1490
[c95]
  dblp key:
  - conf/cvpr/GuZWYL0WZST24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/GuZWYL0WZST24
Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang:
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence. CVPR 2024: 7621-7630
[c94]
  dblp key:
  - conf/cvpr/GuWGSS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/GuWGSS24
Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. CVPR 2024: 7631-7640
[c93]
  dblp key:
  - conf/cvpr/LiuCWMG0KSS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/LiuCWMG0KSS24
Jia-Wei Liu, Yan-Pei Cao, Jay Zhangjie Wu, Weijia Mao, Yuchao Gu, Rui Zhao, Jussi Keppo, Ying Shan, Mike Zheng Shou:
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing. CVPR 2024: 7664-7674
[c92]
  dblp key:
  - conf/cvpr/RanCL0ZWKS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/RanCL0ZWKS24
Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou:
X-Adapter: Universal Compatibility of Plugins for Upgraded Diffusion Model. CVPR 2024: 8775-8784
[c91]
  dblp key:
  - conf/cvpr/Gao0BOLMWZWGWZS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/Gao0BOLMWZWGWZS24
Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou:
AssistGUI: Task-Oriented PC Graphical User Interface Automation. CVPR 2024: 13289-13298
[c90]
  dblp key:
  - conf/cvpr/XieD0LH0SGSS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/XieD0LH0SGSS24
Jinheng Xie, Songhe Deng, Bing Li, Haozhe Liu, Yawen Huang, Yefeng Zheng, Jürgen Schmidhuber, Bernard Ghanem, Linlin Shen, Mike Zheng Shou:
Tune-an-Ellipse: CLIP Has Potential to Find what you Want. CVPR 2024: 13723-13732
[c89]
  dblp key:
  - conf/cvpr/GaoTLCS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/GaoTLCS24
Ziteng Gao, Zhan Tong, Kevin Qinghong Lin, Joya Chen, Mike Zheng Shou:
Bootstrapping SparseFormers from Vision Foundation Models. CVPR 2024: 17710-17721
[c88]
  dblp key:
  - conf/cvpr/ChenLWLSGLGMS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ChenLWLSGLGMS24
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou:
VideoLLM-online: Online Video Large Language Model for Streaming Video. CVPR 2024: 18407-18418
[c87]
  dblp key:
  - conf/cvpr/Sun0FGMS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/Sun0FGMS24
Jingtao Sun, Yaonan Wang, Mingtao Feng, Yulan Guo, Ajmal Mian, Mike Zheng Shou:
L4D-Track: Language-to-4D Modeling Towards 6-DoF Tracking and Shape Reconstruction in 3D Point Cloud Stream. CVPR 2024: 21146-21156
[c86]
  dblp key:
  - conf/cvpr/LeiGYZGSGSS24
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/LeiGYZGSGSS24
Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou:
VIT-LENS: Towards Omni-modal Representations. CVPR 2024: 26637-26647
[c85]
  dblp key:
  - conf/eccv/ZhaoZS24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ZhaoZS24
Henry Hengyuan Zhao, Pan Zhou, Mike Zheng Shou:
GENIXER: Empowering Multimodal Large Language Model as a Powerful Data Generator. ECCV (23) 2024: 129-147
[c84]
  dblp key:
  - conf/eccv/ZhaoGWZLWKS24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ZhaoGWZLWKS24
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jia-Wei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou:
MotionDirector: Motion Customization of Text-to-Video Diffusion Models. ECCV (56) 2024: 273-290
[c83]
  dblp key:
  - conf/eccv/WuLGZHZSLGZ24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/WuLGZHZSLGZ24
Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang:
DragAnything: Motion Control for Anything Using Entity Representation. ECCV (22) 2024: 331-348
[c82]
  dblp key:
  - conf/eccv/CiYSS24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/CiYSS24
Hai Ci, Pei Yang, Yiren Song, Mike Zheng Shou:
RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-key Identification. ECCV (28) 2024: 338-354
[c81]
  dblp key:
  - conf/eccv/LinHWWLS24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/LinHWWLS24
Yiqi Lin, Conghui He, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou:
Parrot Captions Teach CLIP to Spot Text. ECCV (42) 2024: 368-385
[c80]
  dblp key:
  - conf/eccv/LinZGXCGXXS24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/LinZGXCGXXS24
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou:
Learning Video Context as Interleaved Multimodal Sequences. ECCV (49) 2024: 375-396
[c79]
  dblp key:
  - conf/eccv/ZhangXWXZHBS24
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ZhangXWXZHBS24
David Junhao Zhang, Mutian Xu, Jay Zhangjie Wu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou:
Free-ATM: Harnessing Free Attention Masks for Representation Learning on Diffusion-Generated Images. ECCV (40) 2024: 465-482
[c78]
  dblp key:
  - conf/hcma/GaoHLS24
  persistent URL:
  - https://dblp.org/rec/conf/hcma/GaoHLS24
Difei Gao, Siyuan Hu, Qinghong Lin, Mike Zheng Shou:
AssistGPT: Towards Multi-modal Agent for Human-Centric AI Assistant. HCMA@MM 2024: 3-5
[c77]
  dblp key:
  - conf/icassp/SongWZS024
  persistent URL:
  - https://dblp.org/rec/conf/icassp/SongWZS024
Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. ICASSP 2024: 226-230
[c76]
  dblp key:
  - conf/iclr/GaoT0S24
  persistent URL:
  - https://dblp.org/rec/conf/iclr/GaoT0S24
Ziteng Gao, Zhan Tong, Limin Wang, Mike Zheng Shou:
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens. ICLR 2024
[c75]
  dblp key:
  - conf/ijcai/WangMBWS00024
  persistent URL:
  - https://dblp.org/rec/conf/ijcai/WangMBWS00024
Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. IJCAI 2024: 3160-3168
[c74]
  dblp key:
  - conf/ijcai/HuLGTW0S24
  persistent URL:
  - https://dblp.org/rec/conf/ijcai/HuLGTW0S24
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces. IJCAI 2024: 5862-5871
[c73]
  dblp key:
  - conf/mm/MaoCGFS24
  persistent URL:
  - https://dblp.org/rec/conf/mm/MaoCGFS24
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou:
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance. ACM Multimedia 2024: 6842-6850
[c72]
  dblp key:
  - conf/mm/GaoHBLS24
  persistent URL:
  - https://dblp.org/rec/conf/mm/GaoHBLS24
Difei Gao, Siyuan Hu, Zechen Bai, Qinghong Lin, Mike Zheng Shou:
AssistEditor: Multi-Agent Collaboration for GUI Workflow Automation in Video Creation. ACM Multimedia 2024: 11255-11257
[c71]
  dblp key:
  - conf/nips/Bai0MWGClZS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/Bai0MWGClZS24
Zechen Bai, Tong He, Haiyang Mei, Pichao Wang, Ziteng Gao, Joya Chen, Lei Liu, Zheng Zhang, Mike Zheng Shou:
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos. NeurIPS 2024
[c70]
  dblp key:
  - conf/nips/LinLGWYYWS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/LinLGWYYWS24
Kevin Qinghong Lin, Linjie Li, Difei Gao, Qinchen Wu, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou:
VideoGUI: A Benchmark for GUI Automation from Instructional Videos. NeurIPS 2024
[c69]
  dblp key:
  - conf/nips/LiuMXKS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/LiuMXKS24
Jia-Wei Liu, Weijia Mao, Zhongcong Xu, Jussi Keppo, Mike Zheng Shou:
Exocentric-to-Egocentric Video Generation. NeurIPS 2024
[c68]
  dblp key:
  - conf/nips/MaXZWRYZWS024
  persistent URL:
  - https://dblp.org/rec/conf/nips/MaXZWRYZWS024
Feipeng Ma, Hongwei Xue, Yizhou Zhou, Guangting Wang, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun:
Visual Perception by Large Language Model's Weights. NeurIPS 2024
[c67]
  dblp key:
  - conf/nips/WangLLLWS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/WangLLLWS24
Alex Jinpeng Wang, Linjie Li, Yiqi Lin, Min Li, Lijuan Wang, Mike Zheng Shou:
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning. NeurIPS 2024
[c66]
  dblp key:
  - conf/nips/WuCLWGX0HCS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/WuCLWGX0HCS24
Shiwei Wu, Joya Chen, Kevin Qinghong Lin, Qimeng Wang, Yan Gao, Qianli Xu, Tong Xu, Yao Hu, Enhong Chen, Mike Zheng Shou:
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation. NeurIPS 2024
[c65]
  dblp key:
  - conf/nips/XuSMBFS024
  persistent URL:
  - https://dblp.org/rec/conf/nips/XuSMBFS024
Binqian Xu, Xiangbo Shu, Haiyang Mei, Zechen Bai, Basura Fernando, Mike Zheng Shou, Jinhui Tang:
DoFIT: Domain-aware Federated Instruction Tuning with Alleviated Catastrophic Forgetting. NeurIPS 2024
[c64]
  dblp key:
  - conf/nips/YangCSS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/YangCSS24
Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou:
Can Simple Averaging Defeat Modern Watermarks? NeurIPS 2024
[c63]
  dblp key:
  - conf/nips/YeLJSS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/YeLJSS24
Zijie Ye, Jia-Wei Liu, Jia Jia, Shikun Sun, Mike Zheng Shou:
Skinned Motion Retargeting with Dense Geometric Interaction Perception. NeurIPS 2024
[c62]
  dblp key:
  - conf/nips/Zhao0GBS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/Zhao0GBS24
Henry Hengyuan Zhao, Pan Zhou, Difei Gao, Zechen Bai, Mike Zheng Shou:
LOVA3: Learning to Visual Question Answering, Asking and Assessment. NeurIPS 2024
[c61]
  dblp key:
  - conf/nips/ZhaoYWZGRWWZZS24
  persistent URL:
  - https://dblp.org/rec/conf/nips/ZhaoYWZGRWWZZS24
Rui Zhao, Hangjie Yuan, Yujie Wei, Shiwei Zhang, Yuchao Gu, Lingmin Ran, Xiang Wang, Jay Zhangjie Wu, David Junhao Zhang, Yingya Zhang, Mike Zheng Shou:
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models. NeurIPS 2024
[c60]
  dblp key:
  - conf/siggrapha/SongHYCYLZS24
  persistent URL:
  - https://dblp.org/rec/conf/siggrapha/SongHYCYLZS24
Yiren Song, Shijie Huang, Chen Yao, Hai Ci, Xiaojun Ye, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou:
ProcessPainter: Learning to draw from sequence data. SIGGRAPH Asia 2024: 18:1-18:10
[i148]
  dblp key:
  - journals/corr/abs-2401-00849
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2401-00849
Alex Jinpeng Wang, Linjie Li, Kevin Qinghong Lin, Jianfeng Wang, Kevin Lin, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou:
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training. CoRR abs/2401.00849 (2024)
[i147]
  dblp key:
  - journals/corr/abs-2401-01827
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2401-01827
David Junhao Zhang, Dongxu Li, Hung Le, Mike Zheng Shou, Caiming Xiong, Doyen Sahoo:
Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions. CoRR abs/2401.01827 (2024)
[i146]
  dblp key:
  - journals/corr/abs-2401-07781
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2401-07781
Jay Zhangjie Wu, Guian Fang, Haoning Wu, Xintao Wang, Yixiao Ge, Xiaodong Cun, David Junhao Zhang, Jia-Wei Liu, Yuchao Gu, Rui Zhao, Weisi Lin, Wynne Hsu, Ying Shan, Mike Zheng Shou:
Towards A Better Metric for Text-to-Video Generation. CoRR abs/2401.07781 (2024)
[i145]
  dblp key:
  - journals/corr/abs-2401-13516
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2401-13516
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Delocate: Detection and Localization for Deepfake Videos with Randomly-Located Tampered Traces. CoRR abs/2401.13516 (2024)
[i144]
  dblp key:
  - journals/corr/abs-2402-01345
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-01345
Zongbo Han, Zechen Bai, Haiyang Mei, Qianli Xu, Changqing Zhang, Mike Zheng Shou:
Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models. CoRR abs/2402.01345 (2024)
[i143]
  dblp key:
  - journals/corr/abs-2402-13724
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2402-13724
Zechen Bai, Peng Chen, Xiaolan Peng, Lu Liu, Hui Chen, Mike Zheng Shou, Feng Tian:
Bring Your Own Character: A Holistic Solution for Automatic Facial Animation Generation of Customized Characters. CoRR abs/2402.13724 (2024)
[i142]
  dblp key:
  - journals/corr/abs-2403-07420
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-07420
Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, Di Zhang:
DragAnything: Motion Control for Anything using Entity Representation. CoRR abs/2403.07420 (2024)
[i141]
  dblp key:
  - journals/corr/abs-2403-12728
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2403-12728
Jingtao Sun, Yaonan Wang, Mingtao Feng, Chao Ding, Mike Zheng Shou, Ajmal Saeed Mian:
Diffusion-Driven Self-Supervised Learning for Shape Reconstruction and Pose Estimation. CoRR abs/2403.12728 (2024)
[i140]
  dblp key:
  - journals/corr/abs-2404-02747
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-02747
Wentian Zhang, Haozhe Liu, Jinheng Xie, Francesco Faccio, Mike Zheng Shou, Jürgen Schmidhuber:
Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models. CoRR abs/2404.02747 (2024)
[i139]
  dblp key:
  - journals/corr/abs-2404-14055
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-14055
Hai Ci, Pei Yang, Yiren Song, Mike Zheng Shou:
RingID: Rethinking Tree-Ring Watermarking for Enhanced Multi-Key Identification. CoRR abs/2404.14055 (2024)
[i138]
  dblp key:
  - journals/corr/abs-2404-15909
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-15909
Jinheng Xie, Jiajun Feng, Zhaoxu Tian, Kevin Qinghong Lin, Yawen Huang, Xi Xia, Nanxu Gong, Xu Zuo, Jiaqi Yang, Yefeng Zheng, Mike Zheng Shou:
Learning Long-form Video Prior via Generative Pre-Training. CoRR abs/2404.15909 (2024)
[i137]
  dblp key:
  - journals/corr/abs-2404-18930
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2404-18930
Zechen Bai, Pichao Wang, Tianjun Xiao, Tong He, Zongbo Han, Zheng Zhang, Mike Zheng Shou:
Hallucination of Multimodal Large Language Models: A Survey. CoRR abs/2404.18930 (2024)
[i136]
  dblp key:
  - journals/corr/abs-2405-14974
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-14974
Henry Hengyuan Zhao, Pan Zhou, Difei Gao, Mike Zheng Shou:
LOVA3: Learning to Visual Question Answering, Asking and Assessment. CoRR abs/2405.14974 (2024)
[i135]
  dblp key:
  - journals/corr/abs-2405-19333
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-19333
Feipeng Ma, Hongwei Xue, Guangting Wang, Yizhou Zhou, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun:
Multi-Modal Generative Embedding Model. CoRR abs/2405.19333 (2024)
[i134]
  dblp key:
  - journals/corr/abs-2405-20339
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2405-20339
Feipeng Ma, Hongwei Xue, Guangting Wang, Yizhou Zhou, Fengyun Rao, Shilin Yan, Yueyi Zhang, Siying Wu, Mike Zheng Shou, Xiaoyan Sun:
Visual Perception by Large Language Model's Weights. CoRR abs/2405.20339 (2024)
[i133]
  dblp key:
  - journals/corr/abs-2406-02547
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-02547
Alex Jinpeng Wang, Linjie Li, Yiqi Lin, Min Li, Lijuan Wang, Mike Zheng Shou:
Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning. CoRR abs/2406.02547 (2024)
[i132]
  dblp key:
  - journals/corr/abs-2406-06062
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-06062
Yiren Song, Shijie Huang, Chen Yao, Xiaojun Ye, Hai Ci, Jiaming Liu, Yuxuan Zhang, Mike Zheng Shou:
ProcessPainter: Learn Painting Process from Sequence Data. CoRR abs/2406.06062 (2024)
[i131]
  dblp key:
  - journals/corr/abs-2406-08337
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-08337
Hai Ci, Yiren Song, Pei Yang, Jinheng Xie, Mike Zheng Shou:
WMAdapter: Adding WaterMark Control to Latent Diffusion Models. CoRR abs/2406.08337 (2024)
[i130]
  dblp key:
  - journals/corr/abs-2406-09026
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-09026
Pei Yang, Hai Ci, Yiren Song, Mike Zheng Shou:
Steganalysis on Digital Watermarking: Is Your Defense Truly Impervious? CoRR abs/2406.09026 (2024)
[i129]
  dblp key:
  - journals/corr/abs-2406-10227
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-10227
Kevin Qinghong Lin, Linjie Li, Difei Gao, Qinchen Wu, Mingyi Yan, Zhengyuan Yang, Lijuan Wang, Mike Zheng Shou:
VideoGUI: A Benchmark for GUI Automation from Instructional Videos. CoRR abs/2406.10227 (2024)
[i128]
  dblp key:
  - journals/corr/abs-2406-11816
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-11816
Joya Chen, Zhaoyang Lv, Shiwei Wu, Kevin Qinghong Lin, Chenan Song, Difei Gao, Jia-Wei Liu, Ziteng Gao, Dongxing Mao, Mike Zheng Shou:
VideoLLM-online: Online Video Large Language Model for Streaming Video. CoRR abs/2406.11816 (2024)
[i127]
  dblp key:
  - journals/corr/abs-2406-13719
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2406-13719
Qinchen Wu, Difei Gao, Kevin Qinghong Lin, Zhuoyu Wu, Xiangwu Guo, Peiran Li, Weichen Zhang, Hengxu Wang, Mike Zheng Shou:
GUI Action Narrator: Where and When Did That Action Take Place? CoRR abs/2406.13719 (2024)
[i126]
  dblp key:
  - journals/corr/abs-2407-09521
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-09521
Yang Wang, Haiyang Mei, Qirui Bao, Ziqi Wei, Mike Zheng Shou, Haizhou Li, Bo Dong, Xin Yang:
Apprenticeship-Inspired Elegance: Synergistic Knowledge Distillation Empowers Spiking Neural Networks for Efficient Single-Eye Emotion Recognition. CoRR abs/2407.09521 (2024)
[i125]
  dblp key:
  - journals/corr/abs-2407-21757
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2407-21757
Kevin Qinghong Lin, Pengchuan Zhang, Difei Gao, Xide Xia, Joya Chen, Ziteng Gao, Jinheng Xie, Xuhong Xiao, Mike Zheng Shou:
Learning Video Context as Interleaved Multimodal Sequences. CoRR abs/2407.21757 (2024)
[i124]
  dblp key:
  - journals/corr/abs-2408-07249
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2408-07249
Zechen Bai, Tianjun Xiao, Tong He, Pichao Wang, Zheng Zhang, Thomas Brox, Mike Zheng Shou:
GQE: Generalized Query Expansion for Enhanced Text-Video Retrieval. CoRR abs/2408.07249 (2024)
[i123]
  dblp key:
  - journals/corr/abs-2408-12528
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2408-12528
Jinheng Xie, Weijia Mao, Zechen Bai, David Junhao Zhang, Weihao Wang, Kevin Qinghong Lin, Yuchao Gu, Zhijie Chen, Zhenheng Yang, Mike Zheng Shou:
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation. CoRR abs/2408.12528 (2024)
[i122]
  dblp key:
  - journals/corr/abs-2408-16730
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2408-16730
Shiwei Wu, Joya Chen, Kevin Qinghong Lin, Qimeng Wang, Yan Gao, Qianli Xu, Tong Xu, Yao Hu, Enhong Chen, Mike Zheng Shou:
VideoLLM-MoD: Efficient Video-Language Streaming with Mixture-of-Depths Vision Computation. CoRR abs/2408.16730 (2024)
[i121]
  dblp key:
  - journals/corr/abs-2409-19375
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2409-19375
Zongbo Han, Jialong Yang, Junfan Li, Qinghua Hu, Qianli Xu, Mike Zheng Shou, Changqing Zhang:
DOTA: Distributional Test-Time Adaptation of Vision-Language Models. CoRR abs/2409.19375 (2024)
[i120]
  dblp key:
  - journals/corr/abs-2409-19580
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2409-19580
Zhongcong Xu, Chaoyue Song, Guoxian Song, Jianfeng Zhang, Jun Hao Liew, Hongyi Xu, You Xie, Linjie Luo, Guosheng Lin, Jiashi Feng, Mike Zheng Shou:
High Quality Human Image Animation using Regional Supervision and Motion Blur Condition. CoRR abs/2409.19580 (2024)
[i119]
  dblp key:
  - journals/corr/abs-2409-19603
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2409-19603
Zechen Bai, Tong He, Haiyang Mei, Pichao Wang, Ziteng Gao, Joya Chen, Lei Liu, Zheng Zhang, Mike Zheng Shou:
One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos. CoRR abs/2409.19603 (2024)
[i118]
  dblp key:
  - journals/corr/abs-2410-03858
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-03858
Ziyu Wang, Shuangpeng Han, Mike Zheng Shou, Mengmi Zhang:
Unsupervised Prior Learning: Discovering Categorical Pose Priors from Videos. CoRR abs/2410.03858 (2024)
[i117]
  dblp key:
  - journals/corr/abs-2410-05470
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-05470
Yepeng Liu, Yiren Song, Hai Ci, Yu Zhang, Haofan Wang, Mike Zheng Shou, Yuheng Bu:
Image Watermarks are Removable Using Controllable Regeneration from Clean Noise. CoRR abs/2410.05470 (2024)
[i116]
  dblp key:
  - journals/corr/abs-2410-07133
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-07133
Rui Zhao, Hangjie Yuan, Yujie Wei, Shiwei Zhang, Yuchao Gu, Lingmin Ran, Xiang Wang, Jay Zhangjie Wu, Junhao Zhang, Yingya Zhang, Mike Zheng Shou:
EvolveDirector: Approaching Advanced Text-to-Image Generation with Large Vision-Language Models. CoRR abs/2410.07133 (2024)
[i115]
  dblp key:
  - journals/corr/abs-2410-09592
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-09592
Hongbin Xu, Weitao Chen, Zhipeng Zhou, Feng Xiao, Baigui Sun, Mike Zheng Shou, Wenxiong Kang:
ControLRM: Fast and Controllable 3D Generation via Large Reconstruction Model. CoRR abs/2410.09592 (2024)
[i114]
  dblp key:
  - journals/corr/abs-2410-20986
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2410-20986
Zijie Ye, Jia-Wei Liu, Jia Jia, Shikun Sun, Mike Zheng Shou:
Skinned Motion Retargeting with Dense Geometric Interaction Perception. CoRR abs/2410.20986 (2024)
[i113]
  dblp key:
  - journals/corr/abs-2411-05003
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-05003
David Junhao Zhang, Roni Paiss, Shiran Zada, Nikhil Karnad, David E. Jacobs, Yael Pritch, Inbar Mosseri, Mike Zheng Shou, Neal Wadhwa, Nataniel Ruiz:
ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning. CoRR abs/2411.05003 (2024)
[i112]
  dblp key:
  - journals/corr/abs-2411-10323
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-10323
Siyuan Hu, Mingyu Ouyang, Difei Gao, Mike Zheng Shou:
The Dawn of GUI Agent: A Preliminary Case Study with Claude 3.5 Computer Use. CoRR abs/2411.10323 (2024)
[i111]
  dblp key:
  - journals/corr/abs-2411-14717
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-14717
Binqian Xu, Xiangbo Shu, Haiyang Mei, Guosen Xie, Basura Fernando, Mike Zheng Shou, Jinhui Tang:
FedMLLM: Federated Fine-tuning MLLM on Multimodal Heterogeneity Data. CoRR abs/2411.14717 (2024)
[i110]
  dblp key:
  - journals/corr/abs-2411-15262
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-15262
Weijia Wu, Mingyu Liu, Zeyu Zhu, Xi Xia, Haoen Feng, Wen Wang, Kevin Qinghong Lin, Chunhua Shen, Mike Zheng Shou:
MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation. CoRR abs/2411.15262 (2024)
[i109]
  dblp key:
  - journals/corr/abs-2411-16681
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-16681
Zechen Bai, Jianxiong Gao, Ziteng Gao, Pichao Wang, Zheng Zhang, Tong He, Mike Zheng Shou:
Factorized Visual Tokenization and Generation. CoRR abs/2411.16681 (2024)
[i108]
  dblp key:
  - journals/corr/abs-2411-17465
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-17465
Kevin Qinghong Lin, Linjie Li, Difei Gao, Zhengyuan Yang, Shiwei Wu, Zechen Bai, Weixian Lei, Lijuan Wang, Mike Zheng Shou:
ShowUI: One Vision-Language-Action Model for GUI Visual Agent. CoRR abs/2411.17465 (2024)
[i107]
  dblp key:
  - journals/corr/abs-2411-17949
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2411-17949
Yuchao Gu, Yipin Zhou, Yunfan Ye, Yixin Nie, Licheng Yu, Pingchuan Ma, Kevin Qinghong Lin, Mike Zheng Shou:
ROICtrl: Boosting Instance Control for Visual Generation. CoRR abs/2411.17949 (2024)
[i106]
  dblp key:
  - journals/corr/abs-2412-05980
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2412-05980
Yiren Song, Shengtao Lou, Xiaokang Liu, Hai Ci, Pei Yang, Jiaming Liu, Mike Zheng Shou:
Anti-Reference: Universal and Immediate Defense Against Reference-Based Generation. CoRR abs/2412.05980 (2024)
[i105]
  dblp key:
  - journals/corr/abs-2412-11621
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2412-11621
Muhammet Furkan Ilaslan, Ali Koksal, Kevin Qinghong Lin, Burak Satar, Mike Zheng Shou, Qianli Xu:
VG-TVP: Multimodal Procedural Planning via Visually Grounded Text-Video Prompting. CoRR abs/2412.11621 (2024)
[i104]
  dblp key:
  - journals/corr/abs-2412-11638
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2412-11638
Yiren Song, Pei Yang, Hai Ci, Mike Zheng Shou:
IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation. CoRR abs/2412.11638 (2024)
[i103]
  dblp key:
  - journals/corr/abs-2412-14580
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2412-14580
Yiren Song, Xiaokang Liu, Mike Zheng Shou:
DiffSim: Taming Diffusion Models for Evaluating Visual Similarity. CoRR abs/2412.14580 (2024)
2023
[j2]
  dblp key:
  - journals/tip/WangCZYLWS23
  persistent URL:
  - https://dblp.org/rec/journals/tip/WangCZYLWS23
Wenqian Wang, Faliang Chang, Junhao Zhang, Rui Yan, Chunsheng Liu, Bin Wang, Mike Zheng Shou:
Magi-Net: Meta Negative Network for Early Activity Prediction. IEEE Trans. Image Process. 32: 3254-3265 (2023)
[c59]
  dblp key:
  - conf/aaai/LeiGWWLZS23
  persistent URL:
  - https://dblp.org/rec/conf/aaai/LeiGWWLZS23
Stan Weixian Lei, Difei Gao, Jay Zhangjie Wu, Yuxuan Wang, Wei Liu, Mengmi Zhang, Mike Zheng Shou:
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task. AAAI 2023: 1250-1259
[c58]
  dblp key:
  - conf/aaai/YanSGW0C023
  persistent URL:
  - https://dblp.org/rec/conf/aaai/YanSGW0C023
Rui Yan, Mike Zheng Shou, Yixiao Ge, Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang:
Video-Text Pre-training with Learned Regions for Retrieval. AAAI 2023: 3100-3108
[c57]
  dblp key:
  - conf/aaai/ZhangSGXWYSS23
  persistent URL:
  - https://dblp.org/rec/conf/aaai/ZhangSGXWYSS23
Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan:
Darwinian Model Upgrades: Model Evolving with Selective Compatibility. AAAI 2023: 3393-3400
[c56]
  dblp key:
  - conf/acl/HouZ0GYCNSD23
  persistent URL:
  - https://dblp.org/rec/conf/acl/HouZ0GYCNSD23
Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Mike Zheng Shou, Nan Duan:
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding. ACL (1) 2023: 8013-8028
[c55]
  dblp key:
  - conf/cvpr/ChangWWFS23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ChangWWFS23
Shuning Chang, Pichao Wang, Fan Wang, Jiashi Feng, Mike Zheng Shou:
DOAD: Decoupled One Stage Action Detection Network. CVPR Workshops 2023: 3123-3132
[c54]
  dblp key:
  - conf/cvpr/ChangWLWZ0S23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ChangWLWZ0S23
Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou:
Making Vision Transformers Efficient from A Token Sparsification View. CVPR 2023: 6195-6205
[c53]
  dblp key:
  - conf/cvpr/WangGYGLT0CWSQS23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/WangGYGLT0CWSQS23
Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Kevin Qinghong Lin, Satoshi Tsutsui, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
All in One: Exploring Unified Video-Language Pre-Training. CVPR 2023: 6598-6608
[c52]
  dblp key:
  - conf/cvpr/ChenGLS23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ChenGLS23
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou:
Affordance Grounding from Demonstration Video to Target Image. CVPR 2023: 6799-6808
[c51]
  dblp key:
  - conf/cvpr/GaoZ0ZYS23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/GaoZ0ZYS23
Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou:
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering. CVPR 2023: 14773-14783
[c50]
  dblp key:
  - conf/cvpr/0003THLSJC23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/0003THLSJC23
Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang:
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CVPR 2023: 14846-14855
[c49]
  dblp key:
  - conf/cvpr/WangZSY23
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/WangZSY23
Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan:
Position-Guided Text Prompt for Vision-Language Pre-Training. CVPR 2023: 23242-23251
[c48]
  dblp key:
  - conf/emnlp/IlaslanSCGLXLS23
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/IlaslanSCGLXLS23
Muhammet Furkan Ilaslan, Chenan Song, Joya Chen, Difei Gao, Weixian Lei, Qianli Xu, Joo Lim, Mike Zheng Shou:
GazeVQA: A Video Question Answering Dataset for Multiview Eye-Gaze Task-Oriented Collaborations. EMNLP 2023: 10462-10479
[c47]
  dblp key:
  - conf/iccv/WuZSZS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WuZSZS23
Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen:
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models. ICCV 2023: 1206-1217
[c46]
  dblp key:
  - conf/iccv/LinZCPGWYS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LinZCPGWYS23
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou:
UniVTG: Towards Unified Video-Language Temporal Grounding. ICCV 2023: 2782-2792
[c45]
  dblp key:
  - conf/iccv/WangLZLS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WangLZLS23
Alex Jinpeng Wang, Kevin Qinghong Lin, David Junhao Zhang, Stan Weixian Lei, Mike Zheng Shou:
Too Large; Data Reduction for Vision-Language Pre-Training. ICCV 2023: 3124-3134
[c44]
  dblp key:
  - conf/iccv/LiXFZLLLKSY23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LiXFZLLLKSY23
Ming Li, Xiangyu Xu, Hehe Fan, Pan Zhou, Jun Liu, Jia-Wei Liu, Jiahe Li, Jussi Keppo, Mike Zheng Shou, Shuicheng Yan:
STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition. ICCV 2023: 5083-5092
[c43]
  dblp key:
  - conf/iccv/PramanickSNLSSC23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/PramanickSNLSSC23
Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang:
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone. ICCV 2023: 5262-5274
[c42]
  dblp key:
  - conf/iccv/XieLHLZ0S23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/XieLHLZ0S23
Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou:
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion. ICCV 2023: 7418-7427
[c41]
  dblp key:
  - conf/iccv/WuGWLGSHSQS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WuGWLGSHSQS23
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Stan Weixian Lei, Yuchao Gu, Yufei Shi, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. ICCV 2023: 7589-7599
[c40]
  dblp key:
  - conf/iccv/SinghLSLGT0SKZ23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/SinghLSLGT0SKZ23
Parantak Singh, You Li, Ankur Sikarwar, Weixian Lei, Difei Gao, Morgan B. Talbot, Ying Sun, Mike Zheng Shou, Gabriel Kreiman, Mengmi Zhang:
Learning to Learn: How to Continuously Teach Humans and Machines. ICCV 2023: 11674-11685
[c39]
  dblp key:
  - conf/iccv/FanBXZHZSSLSBZF23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/FanBXZHZSSLSBZF23
Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He:
Unsupervised Open-Vocabulary Object Localization in Videos. ICCV 2023: 13701-13709
[c38]
  dblp key:
  - conf/iccv/LiuCYXKSQS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/LiuCYXKSQS23
Jia-Wei Liu, Yan-Pei Cao, Tianyuan Yang, Zhongcong Xu, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video. ICCV 2023: 18437-18448
[c37]
  dblp key:
  - conf/iccv/WuZHZS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/WuZHZS23
Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou:
Label-Efficient Online Continual Object Detection in Streaming Video. ICCV 2023: 19189-19198
[c36]
  dblp key:
  - conf/iccv/ChangWLWS23
  persistent URL:
  - https://dblp.org/rec/conf/iccv/ChangWLWS23
Shuning Chang, Pichao Wang, Hao Luo, Fan Wang, Mike Zheng Shou:
Revisiting Vision Transformer from the View of Path Ensemble. ICCV 2023: 19832-19842
[c35]
  dblp key:
  - conf/icdar/WuZLLSPKB23
  persistent URL:
  - https://dblp.org/rec/conf/icdar/WuZLLSPKB23
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas, Xiang Bai:
ICDAR 2023 Competition on Video Text Reading for Dense and Small Text. ICDAR (2) 2023: 405-419
[c34]
  dblp key:
  - conf/icde/Ooi0STTXYZ023
  persistent URL:
  - https://dblp.org/rec/conf/icde/Ooi0STTXYZ023
Beng Chin Ooi, Gang Chen, Mike Zheng Shou, Kian-Lee Tan, Anthony K. H. Tung, Xiaokui Xiao, James Wei Luen Yip, Bingxue Zhang, Meihui Zhang:
The Metaverse Data Deluge: What Can We Do About It? ICDE 2023: 3675-3687
[c33]
  dblp key:
  - conf/iclr/XuZLZBFS23
  persistent URL:
  - https://dblp.org/rec/conf/iclr/XuZLZBFS23
Eric Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou:
PV3D: A 3D Generative Model for Portrait Video Generation. ICLR 2023
[c32]
  dblp key:
  - conf/lgm3a/Shou23
  persistent URL:
  - https://dblp.org/rec/conf/lgm3a/Shou23
Mike Zheng Shou:
Large Generative Models Meet Multimodal Video Intelligence. LGM3A@MM 2023: 1
[c31]
  dblp key:
  - conf/mm/XueYL0T0YSS23
  persistent URL:
  - https://dblp.org/rec/conf/mm/XueYL0T0YSS23
Xizhe Xue, Dongdong Yu, Lingqiao Liu, Yu Liu, Satoshi Tsutsui, Ying Li, Zehuan Yuan, Ping Song, Mike Zheng Shou:
Transformer-based Open-world Instance Segmentation with Cross-task Consistency Regularization. ACM Multimedia 2023: 2507-2515
[c30]
  dblp key:
  - conf/nips/GuWWSCFXZCWGSS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/GuWWSCFXZCWGSS23
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. NeurIPS 2023
[c29]
  dblp key:
  - conf/nips/WangSZ23
  persistent URL:
  - https://dblp.org/rec/conf/nips/WangSZ23
Ziyu Wang, Mike Zheng Shou, Mengmi Zhang:
Object-centric Learning with Cyclic Walks between Parts and Whole. NeurIPS 2023
[c28]
  dblp key:
  - conf/nips/WuZCGZHZSS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/WuZCGZHZSS23
Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen:
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models. NeurIPS 2023
[c27]
  dblp key:
  - conf/nips/Xie0LLL0SS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/Xie0LLL0SS23
Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou:
Learning Visual Prior via Generative Pre-Training. NeurIPS 2023
[c26]
  dblp key:
  - conf/nips/XuZLFS23
  persistent URL:
  - https://dblp.org/rec/conf/nips/XuZLFS23
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Jiashi Feng, Mike Zheng Shou:
XAGen: 3D Expressive Human Avatars Generation. NeurIPS 2023
[i102]
  dblp key:
  - journals/corr/abs-2301-03046
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2301-03046
Ming Li, Jun Liu, Hehe Fan, Jiawei Liu, Jiahe Li, Mike Zheng Shou, Jussi Keppo:
STPrivacy: Spatio-Temporal Tubelet Sparsification and Anonymization for Privacy-preserving Action Recognition. CoRR abs/2301.03046 (2023)
[i101]
  dblp key:
  - journals/corr/abs-2302-08023
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2302-08023
Ziyu Wang, Mike Zheng Shou, Mengmi Zhang:
Object-centric Learning with Cyclic Walks between Parts and Whole. CoRR abs/2302.08023 (2023)
[i100]
  dblp key:
  - journals/corr/abs-2303-01740
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-01740
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Zheng Qin, Mike Zheng Shou:
DeepfakeMAE: Facial Part Consistency Aware Masked Autoencoder for Deepfake Video Detection. CoRR abs/2303.01740 (2023)
[i99]
  dblp key:
  - journals/corr/abs-2303-07910
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-07910
Hengyuan Zhao, Hao Luo, Yuyang Zhao, Pichao Wang, Fan Wang, Mike Zheng Shou:
Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm. CoRR abs/2303.07910 (2023)
[i98]
  dblp key:
  - journals/corr/abs-2303-08685
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-08685
Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou:
Making Vision Transformers Efficient from A Token Sparsification View. CoRR abs/2303.08685 (2023)
[i97]
  dblp key:
  - journals/corr/abs-2303-11681
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-11681
Weijia Wu, Yuzhong Zhao, Mike Zheng Shou, Hong Zhou, Chunhua Shen:
DiffuMask: Synthesizing Images with Pixel-level Annotations for Semantic Segmentation Using Diffusion Models. CoRR abs/2303.11681 (2023)
[i96]
  dblp key:
  - journals/corr/abs-2303-14644
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2303-14644
Joya Chen, Difei Gao, Kevin Qinghong Lin, Mike Zheng Shou:
Affordance Grounding from Demonstration Video to Target Image. CoRR abs/2303.14644 (2023)
[i95]
- view
  - electronic edition via DOI (open access)
  - details & citations
  authority control:
- export record
  dblp key:
  - journals/corr/abs-2304-00254
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-00254
Shuning Chang, Pichao Wang, Fan Wang, Jiashi Feng, Mike Zheng Shou:
DOAD: Decoupled One Stage Action Detection Network. CoRR abs/2304.00254 (2023)
[i94]
  dblp key:
  - journals/corr/abs-2304-03768
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-03768
Ziteng Gao, Zhan Tong, Limin Wang, Mike Zheng Shou:
SparseFormer: Sparse Visual Recognition via Limited Latent Tokens. CoRR abs/2304.03768 (2023)
[i93]
  dblp key:
  - journals/corr/abs-2304-04023
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-04023
Binqian Xu, Xiangbo Shu, Rui Yan, Guo-Sen Xie, Yixiao Ge, Mike Zheng Shou:
Attack is Good Augmentation: Towards Skeleton-Contrastive Representation Learning. CoRR abs/2304.04023 (2023)
[i92]
  dblp key:
  - journals/corr/abs-2304-04376
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-04376
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Mike Zheng Shou, Umapada Pal, Dimosthenis Karatzas, Xiang Bai:
ICDAR 2023 Video Text Reading Competition for Dense and Small Text. CoRR abs/2304.04376 (2023)
[i91]
  dblp key:
  - journals/corr/abs-2304-08271
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-08271
Jinheng Xie, Zhaochuan Luo, Yuexiang Li, Haozhe Liu, Linlin Shen, Mike Zheng Shou:
Open-World Weakly-Supervised Object Localization. CoRR abs/2304.08271 (2023)
[i90]
  dblp key:
  - journals/corr/abs-2304-12281
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2304-12281
Jiawei Liu, Yan-Pei Cao, Tianyuan Yang, Eric Zhongcong Xu, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
HOSNeRF: Dynamic Human-Object-Scene Neural Radiance Fields from a Single Video. CoRR abs/2304.12281 (2023)
[i89]
  dblp key:
  - journals/corr/abs-2305-03347
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-03347
Weijia Wu, Yuzhong Zhao, Zhuang Li, Jiahong Li, Hong Zhou, Mike Zheng Shou, Xiang Bai:
A Large Cross-Modal Video Retrieval Dataset with Reading Comprehension. CoRR abs/2305.03347 (2023)
[i88]
  dblp key:
  - journals/corr/abs-2305-05943
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-05943
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Mover: Mask and Recovery based Facial Part Consistency Aware Method for Deepfake Video Detection. CoRR abs/2305.05943 (2023)
[i87]
  dblp key:
  - journals/corr/abs-2305-13777
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-13777
Jinheng Xie, Kai Ye, Yudong Li, Yuexiang Li, Kevin Qinghong Lin, Yefeng Zheng, Linlin Shen, Mike Zheng Shou:
VisorGPT: Learning Visual Prior via Generative Pre-Training. CoRR abs/2305.13777 (2023)
[i86]
  dblp key:
  - journals/corr/abs-2305-18292
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-18292
Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou:
Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models. CoRR abs/2305.18292 (2023)
[i85]
  dblp key:
  - journals/corr/abs-2305-20087
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2305-20087
Alex Jinpeng Wang, Kevin Qinghong Lin, David Junhao Zhang, Stan Weixian Lei, Mike Zheng Shou:
Too Large; Data Reduction for Vision-Language Pre-Training. CoRR abs/2305.20087 (2023)
[i84]
  dblp key:
  - journals/corr/abs-2306-08640
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-08640
Difei Gao, Lei Ji, Luowei Zhou, Kevin Qinghong Lin, Joya Chen, Zihan Fan, Mike Zheng Shou:
AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn. CoRR abs/2306.08640 (2023)
[i83]
  dblp key:
  - journals/corr/abs-2306-12642
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-12642
Binjie Zhang, Yixiao Ge, Xuyuan Xu, Ying Shan, Mike Zheng Shou:
TaCA: Upgrading Your Visual Foundation Model with Task-agnostic Compatible Adapter. CoRR abs/2306.12642 (2023)
[i82]
  dblp key:
  - journals/corr/abs-2306-15255
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2306-15255
Zhijian Hou, Lei Ji, Difei Gao, Wanjun Zhong, Kun Yan, Chao Li, Wing-Kwong Chan, Chong-Wah Ngo, Nan Duan, Mike Zheng Shou:
GroundNLQ @ Ego4D Natural Language Queries Challenge 2023. CoRR abs/2306.15255 (2023)
[i81]
  dblp key:
  - journals/corr/abs-2307-05463
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-05463
Shraman Pramanick, Yale Song, Sayan Nag, Kevin Qinghong Lin, Hardik Shah, Mike Zheng Shou, Rama Chellappa, Pengchuan Zhang:
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone. CoRR abs/2307.05463 (2023)
[i80]
  dblp key:
  - journals/corr/abs-2307-10816
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-10816
Jinheng Xie, Yuexiang Li, Yawen Huang, Haozhe Liu, Wentian Zhang, Yefeng Zheng, Mike Zheng Shou:
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion. CoRR abs/2307.10816 (2023)
[i79]
  dblp key:
  - journals/corr/abs-2307-16715
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2307-16715
Kevin Qinghong Lin, Pengchuan Zhang, Joya Chen, Shraman Pramanick, Difei Gao, Alex Jinpeng Wang, Rui Yan, Mike Zheng Shou:
UniVTG: Towards Unified Video-Language Temporal Grounding. CoRR abs/2307.16715 (2023)
[i78]
  dblp key:
  - journals/corr/abs-2308-06160
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-06160
Weijia Wu, Yuzhong Zhao, Hao Chen, Yuchao Gu, Rui Zhao, Yefei He, Hong Zhou, Mike Zheng Shou, Chunhua Shen:
DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models. CoRR abs/2308.06160 (2023)
[i77]
  dblp key:
  - journals/corr/abs-2308-06548
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-06548
Shuning Chang, Pichao Wang, Hao Luo, Fan Wang, Mike Zheng Shou:
Revisiting Vision Transformer from the View of Path Ensemble. CoRR abs/2308.06548 (2023)
[i76]
  dblp key:
  - journals/corr/abs-2308-06739
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-06739
David Junhao Zhang, Mutian Xu, Chuhui Xue, Wenqing Zhang, Xiaoguang Han, Song Bai, Mike Zheng Shou:
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks. CoRR abs/2308.06739 (2023)
[i75]
  dblp key:
  - journals/corr/abs-2308-09921
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-09921
Juan Hu, Xin Liao, Difei Gao, Satoshi Tsutsui, Qian Wang, Zheng Qin, Mike Zheng Shou:
Recap: Detecting Deepfake Video with Unpredictable Tampered Traces via Recovering Faces and Mapping Recovered Faces. CoRR abs/2308.09921 (2023)
[i74]
  dblp key:
  - journals/corr/abs-2308-10185
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2308-10185
Weixian Lei, Yixiao Ge, Jianfeng Zhang, Dylan Sun, Kun Yi, Ying Shan, Mike Zheng Shou:
ViT-Lens: Towards Omni-modal Representations. CoRR abs/2308.10185 (2023)
[i73]
  dblp key:
  - journals/corr/abs-2309-07698
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-07698
David Junhao Zhang, Heng Wang, Chuhui Xue, Rui Yan, Wenqing Zhang, Song Bai, Mike Zheng Shou:
Dataset Condensation via Generative Model. CoRR abs/2309.07698 (2023)
[i72]
  dblp key:
  - journals/corr/abs-2309-08513
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-08513
Henry Hengyuan Zhao, Pichao Wang, Yuyang Zhao, Hao Luo, Fan Wang, Mike Zheng Shou:
SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels. CoRR abs/2309.08513 (2023)
[i71]
  dblp key:
  - journals/corr/abs-2309-09469
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-09469
Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-LEAF: A Learnable Auditory front-end for Spiking Neural Networks. CoRR abs/2309.09469 (2023)
[i70]
  dblp key:
  - journals/corr/abs-2309-09858
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-09858
Ke Fan, Zechen Bai, Tianjun Xiao, Dominik Zietlow, Max Horn, Zixu Zhao, Carl-Johann Simon-Gabriel, Mike Zheng Shou, Francesco Locatello, Bernt Schiele, Thomas Brox, Zheng Zhang, Yanwei Fu, Tong He:
Unsupervised Open-Vocabulary Object Localization in Videos. CoRR abs/2309.09858 (2023)
[i69]
  dblp key:
  - journals/corr/abs-2309-12865
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-12865
Xizhe Xue, Haokui Zhang, Ying Li, Liuwei Wan, Zongwen Bai, Mike Zheng Shou:
Bridging Sensor Gaps via Single-Direction Tuning for Hyperspectral Image Classification. CoRR abs/2309.12865 (2023)
[i68]
  dblp key:
  - journals/corr/abs-2309-15818
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2309-15818
David Junhao Zhang, Jay Zhangjie Wu, Jia-Wei Liu, Rui Zhao, Lingmin Ran, Yuchao Gu, Difei Gao, Mike Zheng Shou:
Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation. CoRR abs/2309.15818 (2023)
[i67]
  dblp key:
  - journals/corr/abs-2310-08465
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-08465
Rui Zhao, Yuchao Gu, Jay Zhangjie Wu, David Junhao Zhang, Jiawei Liu, Weijia Wu, Jussi Keppo, Mike Zheng Shou:
MotionDirector: Motion Customization of Text-to-Video Diffusion Models. CoRR abs/2310.08465 (2023)
[i66]
  dblp key:
  - journals/corr/abs-2310-10624
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-10624
Jiawei Liu, Yan-Pei Cao, Jay Zhangjie Wu, Weijia Mao, Yuchao Gu, Rui Zhao, Jussi Keppo, Ying Shan, Mike Zheng Shou:
DynVideo-E: Harnessing Dynamic NeRF for Large-Scale Motion- and View-Change Human-Centric Video Editing. CoRR abs/2310.10624 (2023)
[i65]
  dblp key:
  - journals/corr/abs-2310-16002
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-16002
Jinbin Bai, Zhen Dong, Aosong Feng, Xiao Zhang, Tian Ye, Kaicheng Zhou, Mike Zheng Shou:
Integrating View Conditions for Image Synthesis. CoRR abs/2310.16002 (2023)
[i64]
  dblp key:
  - journals/corr/abs-2310-16003
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2310-16003
Jay Zhangjie Wu, Xiuyu Li, Difei Gao, Zhen Dong, Jinbin Bai, Aishani Singh, Xiaoyu Xiang, Youzeng Li, Zuwei Huang, Yuanxi Sun, Rui He, Feng Hu, Junhua Hu, Hai Huang, Hanyu Zhu, Xu Cheng, Jie Tang, Mike Zheng Shou, Kurt Keutzer, Forrest N. Iandola:
CVPR 2023 Text Guided Video Editing Competition. CoRR abs/2310.16003 (2023)
[i63]
  dblp key:
  - journals/corr/abs-2311-13574
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-13574
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Jiashi Feng, Mike Zheng Shou:
XAGen: 3D Expressive Human Avatars Generation. CoRR abs/2311.13574 (2023)
[i62]
  dblp key:
  - journals/corr/abs-2311-14284
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-14284
Weijia Wu, Zhuang Li, Yefei He, Mike Zheng Shou, Chunhua Shen, Lele Cheng, Yan Li, Tingting Gao, Di Zhang, Zhongyuan Wang:
Paragraph-to-Image Generation with Information-Enriched Diffusion Model. CoRR abs/2311.14284 (2023)
[i61]
  dblp key:
  - journals/corr/abs-2311-16081
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-16081
Weixian Lei, Yixiao Ge, Kun Yi, Jianfeng Zhang, Difei Gao, Dylan Sun, Yuying Ge, Ying Shan, Mike Zheng Shou:
ViT-Lens-2: Gateway to Omni-modal Intelligence. CoRR abs/2311.16081 (2023)
[i60]
  dblp key:
  - journals/corr/abs-2311-16498
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-16498
Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Hanshu Yan, Jia-Wei Liu, Chenxu Zhang, Jiashi Feng, Mike Zheng Shou:
MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model. CoRR abs/2311.16498 (2023)
[i59]
  dblp key:
  - journals/corr/abs-2311-17450
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-17450
Weijia Wu, Yuzhong Zhao, Zhuang Li, Lianlei Shan, Hong Zhou, Mike Zheng Shou:
Continual Learning for Image Segmentation with Dynamic Query. CoRR abs/2311.17450 (2023)
[i58]
  dblp key:
  - journals/corr/abs-2311-18765
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2311-18765
Yanqing Liu, Kai Wang, Wenqi Shao, Ping Luo, Yu Qiao, Mike Zheng Shou, Kaipeng Zhang, Yang You:
MLLMs-Augmented Visual-Language Representation Learning. CoRR abs/2311.18765 (2023)
[i57]
  dblp key:
  - journals/corr/abs-2312-00583
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-00583
Bardienus Pieter Duisterhof, Zhao Mandi, Yunchao Yao, Jia-Wei Liu, Mike Zheng Shou, Shuran Song, Jeffrey Ichnowski:
MD-Splatting: Learning Metric Deformation from 4D Gaussians in Highly Deformable Scenes. CoRR abs/2312.00583 (2023)
[i56]
  dblp key:
  - journals/corr/abs-2312-01987
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-01987
Ziteng Gao, Zhan Tong, Kevin Qinghong Lin, Joya Chen, Mike Zheng Shou:
Bootstrapping SparseFormers from Vision Foundation Models. CoRR abs/2312.01987 (2023)
[i55]
  dblp key:
  - journals/corr/abs-2312-02015
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02015
Yufei Shi, Beijia Lu, Jia-Wei Liu, Ming Li, Mike Zheng Shou:
ColonNeRF: Neural Radiance Fields for High-Fidelity Long-Sequence Colonoscopy Reconstruction. CoRR abs/2312.02015 (2023)
[i54]
  dblp key:
  - journals/corr/abs-2312-02087
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02087
Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang:
VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence. CoRR abs/2312.02087 (2023)
[i53]
  dblp key:
  - journals/corr/abs-2312-02238
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-02238
Lingmin Ran, Xiaodong Cun, Jia-Wei Liu, Rui Zhao, Song Zijie, Xintao Wang, Jussi Keppo, Mike Zheng Shou:
X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model. CoRR abs/2312.02238 (2023)
[i52]
  dblp key:
  - journals/corr/abs-2312-06731
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-06731
Henry Hengyuan Zhao, Pan Zhou, Mike Zheng Shou:
Genixer: Empowering Multimodal Large Language Models as a Powerful Data Generator. CoRR abs/2312.06731 (2023)
[i51]
  dblp key:
  - journals/corr/abs-2312-11396
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-11396
Qi Mao, Lan Chen, Yuchao Gu, Zhen Fang, Mike Zheng Shou:
MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance. CoRR abs/2312.11396 (2023)
[i50]
  dblp key:
  - journals/corr/abs-2312-13108
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-13108
Difei Gao, Lei Ji, Zechen Bai, Mingyu Ouyang, Peiran Li, Dongxing Mao, Qinchen Wu, Weichen Zhang, Peiyi Wang, Xiangwu Guo, Hengxu Wang, Luowei Zhou, Mike Zheng Shou:
ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation. CoRR abs/2312.13108 (2023)
[i49]
  dblp key:
  - journals/corr/abs-2312-13324
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-13324
Weijia Mao, Yan-Pei Cao, Jia-Wei Liu, Zhongcong Xu, Mike Zheng Shou:
ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors. CoRR abs/2312.13324 (2023)
[i48]
  dblp key:
  - journals/corr/abs-2312-14232
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2312-14232
Yiqi Lin, Conghui He, Alex Jinpeng Wang, Bin Wang, Weijia Li, Mike Zheng Shou:
Parrot Captions Teach CLIP to Spot Text. CoRR abs/2312.14232 (2023)
2022
[j1]
  dblp key:
  - journals/tip/CaoZCSZ22
  persistent URL:
  - https://dblp.org/rec/journals/tip/CaoZCSZ22
Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou:
Deep Motion Prior for Weakly-Supervised Temporal Action Localization. IEEE Trans. Image Process. 31: 5203-5213 (2022)
[c25]
  dblp key:
  - conf/cvpr/WangGCY0SQS22
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/WangGCY0SQS22
Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Object-aware Video-language Pre-training for Retrieval. CVPR 2022: 3303-3312
[c24]
  dblp key:
  - conf/cvpr/MaSZ0XYY22
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/MaSZ0XYY22
Fan Ma, Mike Zheng Shou, Linchao Zhu, Haoqi Fan, Yilei Xu, Yi Yang, Zhicheng Yan:
Unified Transformer Tracker for Object Tracking. CVPR 2022: 8771-8780
[c23]
  dblp key:
  - conf/cvpr/GraumanWBCFGH0L22
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/GraumanWBCFGH0L22
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran K. Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Ziwei Zhao, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Christian Fuegen, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CVPR 2022: 18973-18990
[c22]
  dblp key:
  - conf/eccv/ZhangLWCCQLS22
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ZhangLWCCQLS22
David Junhao Zhang, Kunchang Li, Yali Wang, Yunpeng Chen, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou:
MorphMLP: An Efficient MLP-Like Backbone for Spatial-Temporal Representation Learning. ECCV (35) 2022: 230-248
[c21]
  dblp key:
  - conf/eccv/WongCWLMGS22
  persistent URL:
  - https://dblp.org/rec/conf/eccv/WongCWLMGS22
Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou:
AssistQ: Affordance-Centric Question-Driven Task Completion for Egocentric Assistant. ECCV (36) 2022: 485-501
[c20]
  dblp key:
  - conf/eccv/WangGYLFS22
  persistent URL:
  - https://dblp.org/rec/conf/eccv/WangGYLFS22
Yuxuan Wang, Difei Gao, Licheng Yu, Weixian Lei, Matt Feiszli, Mike Zheng Shou:
GEB+: A Benchmark for Generic Event Boundary Captioning, Grounding and Retrieval. ECCV (35) 2022: 709-725
[c19]
  dblp key:
  - conf/emnlp/LeiGWMLRS22
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/LeiGWMLRS22
Weixian Lei, Difei Gao, Yuxuan Wang, Dongxing Mao, Zihan Liang, Lingmin Ran, Mike Zheng Shou:
AssistSR: Task-oriented Video Segment Retrieval for Personal AI Assistant. EMNLP (Findings) 2022: 319-338
[c18]
  dblp key:
  - conf/mm/ChangWWLS22
  persistent URL:
  - https://dblp.org/rec/conf/mm/ChangWWLS22
Shuning Chang, Pichao Wang, Fan Wang, Hao Li, Zheng Shou:
Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation. HCMA@MM 2022: 41-50
[c17]
  dblp key:
  - conf/mm/XuSTFYS22
  persistent URL:
  - https://dblp.org/rec/conf/mm/XuSTFYS22
Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou:
AVA-AVD: Audio-visual Speaker Diarization in the Wild. ACM Multimedia 2022: 3838-3847
[c16]
  dblp key:
  - conf/mm/JinSZTQY22
  persistent URL:
  - https://dblp.org/rec/conf/mm/JinSZTQY22
Zan-Xia Jin, Mike Zheng Shou, Fang Zhou, Satoshi Tsutsui, Jingyan Qin, Xu-Cheng Yin:
From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA. ACM Multimedia 2022: 4564-4572
[c15]
  dblp key:
  - conf/nips/LinWSWYXGTZKCWD22
  persistent URL:
  - https://dblp.org/rec/conf/nips/LinWSWYXGTZKCWD22
Kevin Qinghong Lin, Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rong-Cheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining. NeurIPS 2022
[c14]
  dblp key:
  - conf/nips/LiuCMZZKSQS22
  persistent URL:
  - https://dblp.org/rec/conf/nips/LiuCMZZKSQS22
Jiawei Liu, Yan-Pei Cao, Weijia Mao, Wenqiao Zhang, David Junhao Zhang, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes. NeurIPS 2022
[i47]
  dblp key:
  - journals/corr/abs-2203-04203
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-04203
Benita Wong, Joya Chen, You Wu, Stan Weixian Lei, Dongxing Mao, Difei Gao, Mike Zheng Shou:
AssistQ: Affordance-centric Question-driven Task Completion for Egocentric Assistant. CoRR abs/2203.04203 (2022)
[i46]
  dblp key:
  - journals/corr/abs-2203-07303
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-07303
Alex Jinpeng Wang, Yixiao Ge, Rui Yan, Yuying Ge, Xudong Lin, Guanyu Cai, Jianping Wu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
All in One: Exploring Unified Video-Language Pre-training. CoRR abs/2203.07303 (2022)
[i45]
  dblp key:
  - journals/corr/abs-2203-07720
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-07720
Guanyu Cai, Yixiao Ge, Alex Jinpeng Wang, Rui Yan, Xudong Lin, Ying Shan, Lianghua He, Xiaohu Qie, Jianping Wu, Mike Zheng Shou:
Revitalize Region Feature for Democratizing Video-Language Pre-training. CoRR abs/2203.07720 (2022)
[i44]
  dblp key:
  - journals/corr/abs-2203-15175
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2203-15175
Fan Ma, Mike Zheng Shou, Linchao Zhu, Haoqi Fan, Yilei Xu, Yi Yang, Zhicheng Yan:
Unified Transformer Tracker for Object Tracking. CoRR abs/2203.15175 (2022)
[i43]
  dblp key:
  - journals/corr/abs-2204-00486
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2204-00486
Yuxuan Wang, Difei Gao, Licheng Yu, Stan Weixian Lei, Matt Feiszli, Mike Zheng Shou:
GEB+: A benchmark for generic event boundary captioning, grounding and text-based retrieval. CoRR abs/2204.00486 (2022)
[i42]
  dblp key:
  - journals/corr/abs-2205-15595
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2205-15595
Satoshi Tsutsui, Weijia Mao, Sijing Lin, Yunyi Zhu, Murong Ma, Mike Zheng Shou:
Novel View Synthesis for High-fidelity Headshot Scenes. CoRR abs/2205.15595 (2022)
[i41]
  dblp key:
  - journals/corr/abs-2205-15723
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2205-15723
Jiawei Liu, Yan-Pei Cao, Weijia Mao, Wenqiao Zhang, David Junhao Zhang, Jussi Keppo, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
DeVRF: Fast Deformable Voxel Radiance Fields for Dynamic Scenes. CoRR abs/2205.15723 (2022)
[i40]
  dblp key:
  - journals/corr/abs-2206-00309
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-00309
Jay Zhangjie Wu, David Junhao Zhang, Wynne Hsu, Mengmi Zhang, Mike Zheng Shou:
Label-Efficient Online Continual Object Detection in Streaming Video. CoRR abs/2206.00309 (2022)
[i39]
  dblp key:
  - journals/corr/abs-2206-01670
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-01670
Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rong-Cheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining. CoRR abs/2206.01670 (2022)
[i38]
  dblp key:
  - journals/corr/abs-2206-02082
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-02082
Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang:
Towards Fast Adaptation of Pretrained Contrastive Models for Multi-channel Video-Language Retrieval. CoRR abs/2206.02082 (2022)
[i37]
  dblp key:
  - journals/corr/abs-2206-10326
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2206-10326
Beng Chin Ooi, Kian-Lee Tan, Anthony K. H. Tung, Gang Chen, Mike Zheng Shou, Xiaokui Xiao, Meihui Zhang:
Sense The Physical, Walkthrough The Virtual, Manage The Metaverse: A Data-centric Perspective. CoRR abs/2206.10326 (2022)
[i36]
  dblp key:
  - journals/corr/abs-2207-01334
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2207-01334
Kevin Qinghong Lin, Alex Jinpeng Wang, Rui Yan, Eric Zhongcong Xu, Rong-Cheng Tu, Yanru Zhu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining @ EPIC-KITCHENS-100 Multi-Instance Retrieval Challenge 2022. CoRR abs/2207.01334 (2022)
[i35]
  dblp key:
  - journals/corr/abs-2207-01622
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2207-01622
Kevin Qinghong Lin, Alex Jinpeng Wang, Mattia Soldan, Michael Wray, Rui Yan, Eric Zhongcong Xu, Difei Gao, Rong-Cheng Tu, Wenzhe Zhao, Weijie Kong, Chengfei Cai, Hongfa Wang, Dima Damen, Bernard Ghanem, Wei Liu, Mike Zheng Shou:
Egocentric Video-Language Pretraining @ Ego4D Challenge 2022. CoRR abs/2207.01622 (2022)
[i34]
  dblp key:
  - journals/corr/abs-2208-09023
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2208-09023
Xizhe Xue, Dongdong Yu, Lingqiao Liu, Yu Liu, Ying Li, Zehuan Yuan, Ping Song, Mike Zheng Shou:
Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization. CoRR abs/2208.09023 (2022)
[i33]
  dblp key:
  - journals/corr/abs-2208-12037
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2208-12037
Stan Weixian Lei, Difei Gao, Jay Zhangjie Wu, Yuxuan Wang, Wei Liu, Mengmi Zhang, Mike Zheng Shou:
Symbolic Replay: Scene Graph as Prompt for Continual Learning on VQA Task. CoRR abs/2208.12037 (2022)
[i32]
  dblp key:
  - journals/corr/abs-2209-10918
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2209-10918
Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan:
CONE: An Efficient COarse-to-fiNE Alignment Framework for Long Video Temporal Grounding. CoRR abs/2209.10918 (2022)
[i31]
  dblp key:
  - journals/corr/abs-2210-06954
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2210-06954
Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan:
Darwinian Model Upgrades: Model Evolving with Selective Compatibility. CoRR abs/2210.06954 (2022)
[i30]
  dblp key:
  - journals/corr/abs-2211-08776
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-08776
Zhijian Hou, Wanjun Zhong, Lei Ji, Difei Gao, Kun Yan, Wing Kwong Chan, Chong-Wah Ngo, Zheng Shou, Nan Duan:
An Efficient COarse-to-fiNE Alignment Framework @ Ego4D Natural Language Queries Challenge 2022. CoRR abs/2211.08776 (2022)
[i29]
  dblp key:
  - journals/corr/abs-2211-15470
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2211-15470
Parantak Singh, You Li, Ankur Sikarwar, Weixian Lei, Daniel Gao, Morgan Bruce Talbot, Ying Sun, Mike Zheng Shou, Gabriel Kreiman, Mengmi Zhang:
Learning to Learn: How to Continuously Teach Humans and Machines. CoRR abs/2211.15470 (2022)
[i28]
  dblp key:
  - journals/corr/abs-2212-03185
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-03185
Yuchao Gu, Xintao Wang, Yixiao Ge, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis. CoRR abs/2212.03185 (2022)
[i27]
  dblp key:
  - journals/corr/abs-2212-06384
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-06384
Eric Zhongcong Xu, Jianfeng Zhang, Jun Hao Liew, Wenqing Zhang, Song Bai, Jiashi Feng, Mike Zheng Shou:
PV3D: A 3D Generative Model for Portrait Video Generation. CoRR abs/2212.06384 (2022)
[i26]
  dblp key:
  - journals/corr/abs-2212-09522
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-09522
Difei Gao, Luowei Zhou, Lei Ji, Linchao Zhu, Yi Yang, Mike Zheng Shou:
MIST: Multi-modal Iterative Spatial-Temporal Transformer for Long-form Video Question Answering. CoRR abs/2212.09522 (2022)
[i25]
  dblp key:
  - journals/corr/abs-2212-09737
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-09737
Alex Jinpeng Wang, Pan Zhou, Mike Zheng Shou, Shuicheng Yan:
Position-guided Text Prompt for Vision-Language Pre-training. CoRR abs/2212.09737 (2022)
[i24]
  dblp key:
  - journals/corr/abs-2212-11565
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2212-11565
Jay Zhangjie Wu, Yixiao Ge, Xintao Wang, Weixian Lei, Yuchao Gu, Wynne Hsu, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation. CoRR abs/2212.11565 (2022)
2021
[c13]
  dblp key:
  - conf/cvpr/PanCSLS021
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/PanCSLS021
Junting Pan, Siyu Chen, Mike Zheng Shou, Yu Liu, Jing Shao, Hongsheng Li:
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization. CVPR 2021: 464-474
[c12]
  dblp key:
  - conf/emnlp/CaoCSZZ21
  persistent URL:
  - https://dblp.org/rec/conf/emnlp/CaoCSZZ21
Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou:
On Pursuit of Designing Multi-modal Transformer for Video Grounding. EMNLP (1) 2021: 9810-9823
[c11]
  dblp key:
  - conf/iccv/GongW0FWY21
  persistent URL:
  - https://dblp.org/rec/conf/iccv/GongW0FWY21
Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan:
Searching for Two-Stream Models in Multivariate Space for Video Recognition. ICCV 2021: 8013-8022
[c10]
  dblp key:
  - conf/iccv/ShouL0GF21
  persistent URL:
  - https://dblp.org/rec/conf/iccv/ShouL0GF21
Mike Zheng Shou, Stan Weixian Lei, Weiyao Wang, Deepti Ghadiyaram, Matt Feiszli:
Generic Event Boundary Detection: A Benchmark for Event Segmentation. ICCV 2021: 8055-8064
[c9]
  dblp key:
  - conf/iccv/YeR0S21
  persistent URL:
  - https://dblp.org/rec/conf/iccv/YeR0S21
Mang Ye, Weijian Ruan, Bo Du, Mike Zheng Shou:
Channel Augmented Joint Learning for Visible-Infrared Recognition. ICCV 2021: 13547-13556
[c8]
  dblp key:
  - conf/mm/TaoPDQS021
  persistent URL:
  - https://dblp.org/rec/conf/mm/TaoPDQS021
Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking?: Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. ACM Multimedia 2021: 3927-3935
[i23]
  dblp key:
  - journals/corr/abs-2101-10511
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2101-10511
Mike Zheng Shou, Deepti Ghadiyaram, Weiyao Wang, Matt Feiszli:
Generic Event Boundary Detection: A Benchmark for Event Segmentation. CoRR abs/2101.10511 (2021)
[i22]
  dblp key:
  - journals/corr/abs-2107-06592
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2107-06592
Ruijie Tao, Zexu Pan, Rohan Kumar Das, Xinyuan Qian, Mike Zheng Shou, Haizhou Li:
Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection. CoRR abs/2107.06592 (2021)
[i21]
  dblp key:
  - journals/corr/abs-2108-05607
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-05607
Meng Cao, Can Zhang, Long Chen, Mike Zheng Shou, Yuexian Zou:
Deep Motion Prior for Weakly-Supervised Temporal Action Localization. CoRR abs/2108.05607 (2021)
[i20]
  dblp key:
  - journals/corr/abs-2108-12957
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2108-12957
Xinyu Gong, Heng Wang, Zheng Shou, Matt Feiszli, Zhangyang Wang, Zhicheng Yan:
Searching for Two-Stream Models in Multivariate Space for Video Recognition. CoRR abs/2108.12957 (2021)
[i19]
  dblp key:
  - journals/corr/abs-2109-06085
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2109-06085
Meng Cao, Long Chen, Mike Zheng Shou, Can Zhang, Yuexian Zou:
On Pursuit of Designing Multi-modal Transformer for Video Grounding. CoRR abs/2109.06085 (2021)
[i18]
  dblp key:
  - journals/corr/abs-2110-07058
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2110-07058
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina González, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jáchym Kolár, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran K. Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbeláez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard A. Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik:
Ego4D: Around the World in 3,000 Hours of Egocentric Video. CoRR abs/2110.07058 (2021)
[i17]
  dblp key:
  - journals/corr/abs-2111-12527
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-12527
David Junhao Zhang, Kunchang Li, Yunpeng Chen, Yali Wang, Shashwat Chandra, Yu Qiao, Luoqi Liu, Mike Zheng Shou:
MorphMLP: A Self-Attention Free, MLP-Like Backbone for Image and Video. CoRR abs/2111.12527 (2021)
[i16]
  dblp key:
  - journals/corr/abs-2111-14448
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-14448
Eric Zhongcong Xu, Zeyang Song, Chao Feng, Mang Ye, Mike Zheng Shou:
AVA-AVD: Audio-visual Speaker Diarization in the Wild. CoRR abs/2111.14448 (2021)
[i15]
  dblp key:
  - journals/corr/abs-2111-15050
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2111-15050
Stan Weixian Lei, Yuxuan Wang, Dongxing Mao, Difei Gao, Mike Zheng Shou:
AssistSR: Affordance-centric Question-driven Video Segment Retrieval. CoRR abs/2111.15050 (2021)
[i14]
  dblp key:
  - journals/corr/abs-2112-00656
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2112-00656
Alex Jinpeng Wang, Yixiao Ge, Guanyu Cai, Rui Yan, Xudong Lin, Ying Shan, Xiaohu Qie, Mike Zheng Shou:
Object-aware Video-language Pre-training for Retrieval. CoRR abs/2112.00656 (2021)
[i13]
  dblp key:
  - journals/corr/abs-2112-01194
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2112-01194
Rui Yan, Mike Zheng Shou, Yixiao Ge, Alex Jinpeng Wang, Xudong Lin, Guanyu Cai, Jinhui Tang:
Video-Text Pre-training with Learned Regions. CoRR abs/2112.01194 (2021)
2020
[c7]
  dblp key:
  - conf/eccv/MaZYZKFS20
  persistent URL:
  - https://dblp.org/rec/conf/eccv/MaZYZKFS20
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou:
SF-Net: Single-Frame Supervision for Temporal Action Localization. ECCV (4) 2020: 420-437
[i12]
  dblp key:
  - journals/corr/abs-2003-06845
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2003-06845
Fan Ma, Linchao Zhu, Yi Yang, Shengxin Zha, Gourab Kundu, Matt Feiszli, Zheng Shou:
SF-Net: Single-Frame Supervision for Temporal Action Localization. CoRR abs/2003.06845 (2020)
[i11]
  dblp key:
  - journals/corr/abs-2006-07976
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-2006-07976
Junting Pan, Siyu Chen, Zheng Shou, Jing Shao, Hongsheng Li:
Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization. CoRR abs/2006.07976 (2020)

2010 – 2019

2019
[b1]
  dblp key:
  - phd/us/Shou19
  persistent URL:
  - https://dblp.org/rec/phd/us/Shou19
Zheng Shou:
Deep Learning for Action Understanding in Video. Columbia University, USA, 2019
[c6]
  dblp key:
  - conf/cvpr/ShouLKSRCY19
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ShouLKSRCY19
Zheng Shou, Xudong Lin, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Shih-Fu Chang, Zhicheng Yan:
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CVPR 2019: 1268-1277
[i10]
  dblp key:
  - journals/corr/abs-1901-03460
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1901-03460
Zheng Shou, Zhicheng Yan, Yannis Kalantidis, Laura Sevilla-Lara, Marcus Rohrbach, Xudong Lin, Shih-Fu Chang:
DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition. CoRR abs/1901.03460 (2019)
[i9]
  dblp key:
  - journals/corr/abs-1905-09904
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1905-09904
Jiawei Ma, Zheng Shou, Alireza Zareian, Hassan Mansour, Anthony Vetro, Shih-Fu Chang:
CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation. CoRR abs/1905.09904 (2019)
[i8]
  dblp key:
  - journals/corr/abs-1910-11285
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1910-11285
Xudong Lin, Zheng Shou, Shih-Fu Chang:
LPAT: Learning to Predict Adaptive Threshold for Weakly-supervised Temporal Action Localization. CoRR abs/1910.11285 (2019)
2018
[c5]
  dblp key:
  - conf/eccv/ShouGZMC18
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ShouGZMC18
Zheng Shou, Hang Gao, Lei Zhang, Kazuyuki Miyazawa, Shih-Fu Chang:
AutoLoc: Weakly-Supervised Temporal Action Localization in Untrimmed Videos. ECCV (16) 2018: 162-179
[c4]
  dblp key:
  - conf/eccv/ShouPCMMVNC18
  persistent URL:
  - https://dblp.org/rec/conf/eccv/ShouPCMMVNC18
Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giró-i-Nieto, Shih-Fu Chang:
Online Detection of Action Start in Untrimmed, Streaming Videos. ECCV (3) 2018: 551-568
[c3]
  dblp key:
  - conf/nips/GaoSZZC18
  persistent URL:
  - https://dblp.org/rec/conf/nips/GaoSZZC18
Hang Gao, Zheng Shou, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang:
Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks. NeurIPS 2018: 983-993
[i7]
  dblp key:
  - journals/corr/abs-1802-06822
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1802-06822
Zheng Shou, Junting Pan, Jonathan Chan, Kazuyuki Miyazawa, Hassan Mansour, Anthony Vetro, Xavier Giró-i-Nieto, Shih-Fu Chang:
Online Action Detection in Untrimmed, Streaming Videos - Modeling and Evaluation. CoRR abs/1802.06822 (2018)
[i6]
  dblp key:
  - journals/corr/abs-1807-08333
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1807-08333
Zheng Shou, Hang Gao, Lei Zhang, Kazuyuki Miyazawa, Shih-Fu Chang:
AutoLoc: Weakly-supervised Temporal Action Localization. CoRR abs/1807.08333 (2018)
[i5]
  dblp key:
  - journals/corr/abs-1810-11730
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1810-11730
Hang Gao, Zheng Shou, Alireza Zareian, Hanwang Zhang, Shih-Fu Chang:
Low-shot Learning via Covariance-Preserving Adversarial Augmentation Networks. CoRR abs/1810.11730 (2018)
2017
[c2]
  dblp key:
  - conf/cvpr/ShouCZMC17
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ShouCZMC17
Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, Shih-Fu Chang:
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos. CVPR 2017: 1417-1426
[i4]
  dblp key:
  - journals/corr/ShouCZMC17
  persistent URL:
  - https://dblp.org/rec/journals/corr/ShouCZMC17
Zheng Shou, Jonathan Chan, Alireza Zareian, Kazuyuki Miyazawa, Shih-Fu Chang:
CDC: Convolutional-De-Convolutional Networks for Precise Temporal Action Localization in Untrimmed Videos. CoRR abs/1703.01515 (2017)
[i3]
  dblp key:
  - journals/corr/abs-1708-05038
  persistent URL:
  - https://dblp.org/rec/journals/corr/abs-1708-05038
Du Tran, Jamie Ray, Zheng Shou, Shih-Fu Chang, Manohar Paluri:
ConvNet Architecture Search for Spatiotemporal Feature Learning. CoRR abs/1708.05038 (2017)
2016
[c1]
  dblp key:
  - conf/cvpr/ShouWC16
  persistent URL:
  - https://dblp.org/rec/conf/cvpr/ShouWC16
Zheng Shou, Dongang Wang, Shih-Fu Chang:
Temporal Action Localization in Untrimmed Videos via Multi-stage CNNs. CVPR 2016: 1049-1058
[i2]
  dblp key:
  - journals/corr/ShouWC16
  persistent URL:
  - https://dblp.org/rec/journals/corr/ShouWC16
Zheng Shou, Dongang Wang, Shih-Fu Chang:
Action Temporal Localization in Untrimmed Videos via Multi-stage CNNs. CoRR abs/1601.02129 (2016)
[i1]
  dblp key:
  - journals/corr/WangSLC16
  persistent URL:
  - https://dblp.org/rec/journals/corr/WangSLC16
Dongang Wang, Zheng Shou, Hongyi Liu, Shih-Fu Chang:
EventNet Version 1.1 Technical Report. CoRR abs/1605.07289 (2016)
