Cordelia Schmid
Person information
- affiliation: INRIA, France
- award (2023): Körber-Preis für die Europäische Wissenschaft
2020 – today
- 2024
- [j55] Pia Bideau, Erik G. Learned-Miller, Cordelia Schmid, Karteek Alahari:
The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields. Int. J. Comput. Vis. 132(1): 40-55 (2024)
- [j54] Lucas Ventura, Antoine Yang, Cordelia Schmid, Gül Varol:
CoVR-2: Automatic Data Construction for Composed Video Retrieval. IEEE Trans. Pattern Anal. Mach. Intell. 46(12): 11409-11421 (2024)
- [j53] Quentin Le Lidec, Wilson Jallet, Louis Montaut, Ivan Laptev, Cordelia Schmid, Justin Carpentier:
Contact Models in Robotics: A Comparative Analysis. IEEE Trans. Robotics 40: 3716-3733 (2024)
- [c274] Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black, Dimitrios Tzionas:
POCO: 3D Pose and Shape Estimation with Confidence. 3DV 2024: 85-95
- [c273] Lucas Ventura, Antoine Yang, Cordelia Schmid, Gül Varol:
CoVR: Learning Composed Video Retrieval from Web Video Captions. AAAI 2024: 5270-5279
- [c272] Otniel-Bogdan Mercea, Alexey A. Gritsenko, Cordelia Schmid, Anurag Arnab:
Time-, Memory- and Parameter-Efficient Visual Adaptation. CVPR 2024: 5536-5545
- [c271] Jiarui Xu, Xingyi Zhou, Shen Yan, Xiuye Gu, Anurag Arnab, Chen Sun, Xiaolong Wang, Cordelia Schmid:
Pixel Aligned Language Models. CVPR 2024: 13030-13039
- [c270] Juhong Min, Shyamal Buch, Arsha Nagrani, Minsu Cho, Cordelia Schmid:
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering. CVPR 2024: 13235-13245
- [c269] Mathilde Caron, Ahmet Iscen, Alireza Fathi, Cordelia Schmid:
A Generative Approach for Wikipedia-Scale Visual Entity Recognition. CVPR 2024: 17313-17322
- [c268] Shizhe Chen, Ricardo Garcia, Ivan Laptev, Cordelia Schmid:
SUGAR: Pre-training 3D Visual Representations for Robotics. CVPR 2024: 18049-18060
- [c267] Xingyi Zhou, Anurag Arnab, Shyamal Buch, Shen Yan, Austin Myers, Xuehan Xiong, Arsha Nagrani, Cordelia Schmid:
Streaming Dense Video Captioning. CVPR 2024: 18243-18252
- [c266] Alexey A. Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lucic, Cordelia Schmid, Anurag Arnab:
End-to-End Spatio-Temporal Action Localisation with Video Transformers. CVPR 2024: 18373-18383
- [c265] Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid, Minsu Cho:
Learning Correlation Structures for Vision Transformers. CVPR 2024: 18941-18951
- [c264] Guillaume Le Moing, Jean Ponce, Cordelia Schmid:
Dense Optical Tracking: Connecting the Dots. CVPR 2024: 19187-19197
- [c263] Jae Myung Kim, Jessica Bader, Stephan Alaniz, Cordelia Schmid, Zeynep Akata:
DataDream: Few-Shot Guided Dataset Generation. ECCV (71) 2024: 252-268
- [c262] Ahmet Iscen, Mathilde Caron, Alireza Fathi, Cordelia Schmid:
Retrieval-Enhanced Contrastive Vision-Text Models. ICLR 2024
- [c261] Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi:
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code. ICML 2024
- [c260] Mathilde Caron, Neil Houlsby, Cordelia Schmid:
Location-Aware Self-Supervised Transformers for Semantic Segmentation. WACV 2024: 116-126
- [i189] Partha Ghosh, Soubhik Sanyal, Cordelia Schmid, Bernhard Schölkopf:
RAVEN: Rethinking Adversarial Video Generation with Efficient Tri-plane Networks. CoRR abs/2401.06035 (2024)
- [i188] Otniel-Bogdan Mercea, Alexey A. Gritsenko, Cordelia Schmid, Anurag Arnab:
Time-, Memory- and Parameter-Efficient Visual Adaptation. CoRR abs/2402.02887 (2024)
- [i187] Ziniu Hu, Ahmet Iscen, Aashi Jain, Thomas Kipf, Yisong Yue, David A. Ross, Cordelia Schmid, Alireza Fathi:
SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code. CoRR abs/2403.01248 (2024)
- [i186] Mathilde Caron, Ahmet Iscen, Alireza Fathi, Cordelia Schmid:
A Generative Approach for Wikipedia-Scale Visual Entity Recognition. CoRR abs/2403.02041 (2024)
- [i185] Xingyi Zhou, Anurag Arnab, Shyamal Buch, Shen Yan, Austin Myers, Xuehan Xiong, Arsha Nagrani, Cordelia Schmid:
Streaming Dense Video Captioning. CoRR abs/2404.01297 (2024)
- [i184] Shizhe Chen, Ricardo Garcia, Ivan Laptev, Cordelia Schmid:
SUGAR: Pre-training 3D Visual Representations for Robotics. CoRR abs/2404.01491 (2024)
- [i183] Manjin Kim, Paul Hongsuck Seo, Cordelia Schmid, Minsu Cho:
Learning Correlation Structures for Vision Transformers. CoRR abs/2404.03924 (2024)
- [i182] Juhong Min, Shyamal Buch, Arsha Nagrani, Minsu Cho, Cordelia Schmid:
MoReVQA: Exploring Modular Reasoning Models for Video Question Answering. CoRR abs/2404.06511 (2024)
- [i181] Zerui Chen, Shizhe Chen, Cordelia Schmid, Ivan Laptev:
ViViDex: Learning Vision-based Dexterous Manipulation from Human Videos. CoRR abs/2404.15709 (2024)
- [i180] Lucas Ventura, Cordelia Schmid, Gül Varol:
Learning text-to-video retrieval from image captioning. CoRR abs/2404.17498 (2024)
- [i179] Riccardo Cadei, Lukas Lindorfer, Sylvia Cremer, Cordelia Schmid, Francesco Locatello:
Smoke and Mirrors in Causal Downstream Tasks. CoRR abs/2405.17151 (2024)
- [i178] Matthieu Futeral, Armel Zebaze, Pedro Ortiz Suarez, Julien Abadji, Rémi Lacroix, Cordelia Schmid, Rachel Bawden, Benoît Sagot:
mOSCAR: A Large-scale Multilingual and Multimodal Document-level Corpus. CoRR abs/2406.08707 (2024)
- [i177] Jae Myung Kim, Jessica Bader, Stephan Alaniz, Cordelia Schmid, Zeynep Akata:
DataDream: Few-shot Guided Dataset Generation. CoRR abs/2407.10910 (2024)
- [i176] Matthieu Futeral, Cordelia Schmid, Benoît Sagot, Rachel Bawden:
Towards Zero-Shot Multimodal Machine Translation. CoRR abs/2407.13579 (2024)
- [i175] Ricardo Garcia, Shizhe Chen, Cordelia Schmid:
Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy. CoRR abs/2410.01345 (2024)
- [i174] Mathilde Caron, Alireza Fathi, Cordelia Schmid, Ahmet Iscen:
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach. CoRR abs/2410.23676 (2024)
- 2023
- [c259] Sanjay Subramanian, Medhini Narasimhan, Kushal Khangaonkar, Kevin Yang, Arsha Nagrani, Cordelia Schmid, Andy Zeng, Trevor Darrell, Dan Klein:
Modular Visual Question Answering via Code Generation. ACL (2) 2023: 747-761
- [c258] Matthieu Futeral, Cordelia Schmid, Ivan Laptev, Benoît Sagot, Rachel Bawden:
Tackling Ambiguity with Images: Improved Multimodal Machine Translation and Contrastive Evaluation. ACL (1) 2023: 5394-5413
- [c257] Shizhe Chen, Ricardo Garcia, Cordelia Schmid, Ivan Laptev:
PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation. CoRL 2023: 1761-1781
- [c256] Xingyi Zhou, Anurag Arnab, Chen Sun, Cordelia Schmid:
How can objects help action recognition? CVPR 2023: 2353-2362
- [c255] Jae-Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata:
Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval. CVPR Workshops 2023: 2585-2595
- [c254] Youngwook Kim, Jae-Myung Kim, Jieun Jeong, Cordelia Schmid, Zeynep Akata, Jungwoo Lee:
Bridging the Gap Between Model Explanations in Partially Annotated Multi-Label Classification. CVPR 2023: 3408-3417
- [c253] Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, Cordelia Schmid:
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. CVPR 2023: 10714-10726
- [c252] Zerui Chen, Shizhe Chen, Cordelia Schmid, Ivan Laptev:
gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction. CVPR 2023: 12890-12900
- [c251] Ahmet Iscen, Alireza Fathi, Cordelia Schmid:
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data. CVPR 2023: 19295-19304
- [c250] Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid:
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR. CVPR 2023: 22922-22931
- [c249] Ziniu Hu, Ahmet Iscen, Chen Sun, Zirui Wang, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, Alireza Fathi:
Reveal: Retrieval-Augmented Visual-Language Pre-Training with Multi-Source Multimodal Knowledge Memory. CVPR 2023: 23369-23379
- [c248] Shen Yan, Xuehan Xiong, Arsha Nagrani, Anurag Arnab, Zhonghao Wang, Weina Ge, David Ross, Cordelia Schmid:
UnLoc: A Unified Framework for Video Localization Tasks. ICCV 2023: 13577-13587
- [c247] Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid:
Verbs in Action: Improving verb understanding in video-language models. ICCV 2023: 15533-15545
- [c246] Karsten Roth, Jae-Myung Kim, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid, Zeynep Akata:
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts. ICCV 2023: 15700-15711
- [c245] Mariana-Iuliana Georgescu, Eduardo Fonseca, Radu Tudor Ionescu, Mario Lucic, Cordelia Schmid, Anurag Arnab:
Audiovisual Masked Autoencoders. ICCV 2023: 16098-16108
- [c244] Guillaume Le Moing, Jean Ponce, Cordelia Schmid:
WALDO: Future Video Synthesis using Object Layer Decomposition and Parametric Flow Prediction. ICCV 2023: 23172-23184
- [c243] Elliot Chane-Sane, Cordelia Schmid, Ivan Laptev:
Learning Video-Conditioned Policies for Unseen Manipulation Tasks. ICRA 2023: 909-916
- [c242] Quentin Le Lidec, Wilson Jallet, Ivan Laptev, Cordelia Schmid, Justin Carpentier:
Enforcing the consensus between Trajectory Optimization and Policy Learning for precise robot control. ICRA 2023: 946-952
- [c241] Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid:
Learning Reward Functions for Robotic Manipulation by Observing Humans. ICRA 2023: 5006-5012
- [c240] Ricardo Garcia, Robin Strudel, Shizhe Chen, Etienne Arlaud, Ivan Laptev, Cordelia Schmid:
Robust Visual Sim-to-Real Transfer for Robotic Manipulation. IROS 2023: 992-999
- [c239] Shizhe Chen, Thomas Chabal, Ivan Laptev, Cordelia Schmid:
Object Goal Navigation with Recursive Implicit Maps. IROS 2023: 7089-7096
- [c238] Chen Sun, Calvin Luo, Xingyi Zhou, Anurag Arnab, Cordelia Schmid:
Does Visual Pretraining Help End-to-End Reasoning? NeurIPS 2023
- [c237] Ziniu Hu, Ahmet Iscen, Chen Sun, Kai-Wei Chang, Yizhou Sun, David Ross, Cordelia Schmid, Alireza Fathi:
AVIS: Autonomous Visual Information Seeking with Large Language Model Agent. NeurIPS 2023
- [c236] Antoine Yang, Arsha Nagrani, Ivan Laptev, Josef Sivic, Cordelia Schmid:
VidChapters-7M: Video Chapters at Scale. NeurIPS 2023
- [i173] Uddeshya Upadhyay, Jae-Myung Kim, Cordelia Schmid, Bernhard Schölkopf, Zeynep Akata:
Posterior Annealing: Fast Calibrated Uncertainty for Regression. CoRR abs/2302.11012 (2023)
- [i172] Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, Antoine Miech, Jordi Pont-Tuset, Ivan Laptev, Josef Sivic, Cordelia Schmid:
Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning. CoRR abs/2302.14115 (2023)
- [i171] Paul Hongsuck Seo, Arsha Nagrani, Cordelia Schmid:
AVFormer: Injecting Vision into Frozen Speech Models for Zero-Shot AV-ASR. CoRR abs/2303.16501 (2023)
- [i170] Youngwook Kim, Jae-Myung Kim, Jieun Jeong, Cordelia Schmid, Zeynep Akata, Jungwoo Lee:
Bridging the Gap between Model Explanations in Partially Annotated Multi-label Classification. CoRR abs/2304.01804 (2023)
- [i169] Jae-Myung Kim, A. Sophia Koepke, Cordelia Schmid, Zeynep Akata:
Exposing and Mitigating Spurious Correlations for Cross-Modal Retrieval. CoRR abs/2304.03391 (2023)
- [i168] Ahmet Iscen, Alireza Fathi, Cordelia Schmid:
Improving Image Recognition by Retrieving from Web-Scale Image-Text Data. CoRR abs/2304.05173 (2023)
- [i167] Quentin Le Lidec, Wilson Jallet, Louis Montaut, Ivan Laptev, Cordelia Schmid, Justin Carpentier:
Contact Models in Robotics: a Comparative Analysis. CoRR abs/2304.06372 (2023)
- [i166] Liliane Momeni, Mathilde Caron, Arsha Nagrani, Andrew Zisserman, Cordelia Schmid:
Verbs in Action: Improving verb understanding in video-language models. CoRR abs/2304.06708 (2023)
- [i165] Zerui Chen, Shizhe Chen, Cordelia Schmid, Ivan Laptev:
gSDF: Geometry-Driven Signed Distance Functions for 3D Hand-Object Reconstruction. CoRR abs/2304.11970 (2023)
- [i164] Alexey A. Gritsenko, Xuehan Xiong, Josip Djolonga, Mostafa Dehghani, Chen Sun, Mario Lucic, Cordelia Schmid, Anurag Arnab:
End-to-End Spatio-Temporal Action Localisation with Video Transformers. CoRR abs/2304.12160 (2023)
- [i163] Elliot Chane-Sane, Cordelia Schmid, Ivan Laptev:
Learning Video-Conditioned Policies for Unseen Manipulation Tasks. CoRR abs/2305.06289 (2023)
- [i162] Sanjay Subramanian, Medhini Narasimhan, Kushal Khangaonkar, Kevin Yang, Arsha Nagrani, Cordelia Schmid, Andy Zeng, Trevor Darrell, Dan Klein:
Modular Visual Question Answering via Code Generation. CoRR abs/2306.05392 (2023)
- [i161] Ahmet Iscen, Mathilde Caron, Alireza Fathi, Cordelia Schmid:
Retrieval-Enhanced Contrastive Vision-Text Models. CoRR abs/2306.07196 (2023)
- [i160] Karsten Roth, Jae-Myung Kim, A. Sophia Koepke, Oriol Vinyals, Cordelia Schmid, Zeynep Akata:
Waffling around for Performance: Visual Classification with Random Words and Broad Concepts. CoRR abs/2306.07282 (2023)
- [i159] Ziniu Hu, Ahmet Iscen, Chen Sun, Kai-Wei Chang, Yizhou Sun, David A. Ross, Cordelia Schmid, Alireza Fathi:
AVIS: Autonomous Visual Information Seeking with Large Language Models. CoRR abs/2306.08129 (2023)
- [i158] Xingyi Zhou, Anurag Arnab, Chen Sun, Cordelia Schmid:
How can objects help action recognition? CoRR abs/2306.11726 (2023)
- [i157] Xingyi Zhou, Anurag Arnab, Chen Sun, Cordelia Schmid:
Dense Video Object Captioning from Disjoint Supervision. CoRR abs/2306.11729 (2023)
- [i156] Chen Sun, Calvin Luo, Xingyi Zhou, Anurag Arnab, Cordelia Schmid:
Does Visual Pretraining Help End-to-End Reasoning? CoRR abs/2307.08506 (2023)
- [i155] Ricardo Garcia, Robin Strudel, Shizhe Chen, Etienne Arlaud, Ivan Laptev, Cordelia Schmid:
Robust Visual Sim-to-Real Transfer for Robotic Manipulation. CoRR abs/2307.15320 (2023)
- [i154] Shizhe Chen, Thomas Chabal, Ivan Laptev, Cordelia Schmid:
Object Goal Navigation with Recursive Implicit Maps. CoRR abs/2308.05602 (2023)
- [i153] Shen Yan, Xuehan Xiong, Arsha Nagrani, Anurag Arnab, Zhonghao Wang, Weina Ge, David Ross, Cordelia Schmid:
UnLoc: A Unified Framework for Video Localization Tasks. CoRR abs/2308.11062 (2023)
- [i152] Sai Kumar Dwivedi, Cordelia Schmid, Hongwei Yi, Michael J. Black, Dimitrios Tzionas:
POCO: 3D Pose and Shape Estimation with Confidence. CoRR abs/2308.12965 (2023)
- [i151] Lucas Ventura, Antoine Yang, Cordelia Schmid, Gül Varol:
CoVR: Learning Composed Video Retrieval from Web Video Captions. CoRR abs/2308.14746 (2023)
- [i150] Antoine Yang, Arsha Nagrani, Ivan Laptev, Josef Sivic, Cordelia Schmid:
VidChapters-7M: Video Chapters at Scale. CoRR abs/2309.13952 (2023)
- [i149] Shizhe Chen, Ricardo Garcia, Cordelia Schmid, Ivan Laptev:
PolarNet: 3D Point Clouds for Language-Guided Robotic Manipulation. CoRR abs/2309.15596 (2023)
- [i148] Guillaume Le Moing, Jean Ponce, Cordelia Schmid:
Dense Optical Tracking: Connecting the Dots. CoRR abs/2312.00786 (2023)
- [i147] Jiarui Xu, Xingyi Zhou, Shen Yan, Xiuye Gu, Anurag Arnab, Chen Sun, Xiaolong Wang, Cordelia Schmid:
Pixel Aligned Language Models. CoRR abs/2312.09237 (2023)
- 2022
- [c235] Ahmet Iscen, Thomas Bird, Mathilde Caron, Alireza Fathi, Cordelia Schmid:
A Memory Transformer Network for Incremental Learning. BMVC 2022: 388
- [c234] Pierre-Louis Guhur, Shizhe Chen, Ricardo Garcia, Makarand Tapaswi, Ivan Laptev, Cordelia Schmid:
Instruction-driven history-aware policies for robotic manipulations. CoRL 2022: 175-187
- [c233] Shen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun, Cordelia Schmid:
Multiview Transformers for Video Recognition. CVPR 2022: 3323-3333
- [c232] Ahmet Iscen, Jack Valmadre, Anurag Arnab, Cordelia Schmid:
Learning with Neighbor Consistency for Noisy Labels. CVPR 2022: 4662-4671
- [c231] Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid:
TubeDETR: Spatio-Temporal Video Grounding with Transformers. CVPR 2022: 16421-16432
- [c230] Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev:
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation. CVPR 2022: 16516-16526
- [c229] Paul Hongsuck Seo, Arsha Nagrani, Anurag Arnab, Cordelia Schmid:
End-to-end Generative Pretraining for Multimodal Video Captioning. CVPR 2022: 17938-17947
- [c228] Zerui Chen, Yana Hasson, Cordelia Schmid, Ivan Laptev:
AlignSDF: Pose-Aligned Signed Distance Fields for Hand-Object Reconstruction. ECCV (1) 2022: 231-248
- [c227] Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid:
Learning Audio-Video Modalities from Image Captions. ECCV (14) 2022: 407-426
- [c226] Medhini Narasimhan, Arsha Nagrani, Chen Sun, Michael Rubinstein, Trevor Darrell, Anna Rohrbach, Cordelia Schmid:
TL;DW? Summarizing Instructional Videos with Task Relevance and Cross-Modal Saliency. ECCV (34) 2022: 540-557
- [c225] Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev:
Learning from Unlabeled 3D Environments for Vision-and-Language Navigation. ECCV (39) 2022: 638-655
- [c224] Valentin Gabeur, Paul Hongsuck Seo, Arsha Nagrani, Chen Sun, Karteek Alahari, Cordelia Schmid:
AVATAR: Unconstrained Audiovisual Speech Recognition. INTERSPEECH 2022: 2818-2822
- [c223] Thomas Chabal, Robin Strudel, Etienne Arlaud, Jean Ponce, Cordelia Schmid:
Assembly Planning from Observations under Physical Constraints. IROS 2022: 10223-10229
- [c222] Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev:
Language Conditioned Spatial Relation Reasoning for 3D Object Grounding. NeurIPS 2022
- [c221] Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid:
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models. NeurIPS 2022
- [c220] Valentin Gabeur, Arsha Nagrani, Chen Sun, Karteek Alahari, Cordelia Schmid:
Masking Modalities for Cross-modal Video Retrieval. WACV 2022: 2111-2120
- [i146] Shen Yan, Xuehan Xiong, Anurag Arnab, Zhichao Lu, Mi Zhang, Chen Sun, Cordelia Schmid:
Multiview Transformers for Video Recognition. CoRR abs/2201.04288 (2022)
- [i145] Paul Hongsuck Seo, Arsha Nagrani, Anurag Arnab, Cordelia Schmid:
End-to-end Generative Pretraining for Multimodal Video Captioning. CoRR abs/2201.08264 (2022)
- [i144] Ahmet Iscen, Jack Valmadre, Anurag Arnab, Cordelia Schmid:
Learning with Neighbor Consistency for Noisy Labels. CoRR abs/2202.02200 (2022)
- [i143] Shizhe Chen, Pierre-Louis Guhur, Makarand Tapaswi, Cordelia Schmid, Ivan Laptev:
Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation. CoRR abs/2202.11742 (2022)
- [i142] Pia Bideau, Erik G. Learned-Miller, Cordelia Schmid, Karteek Alahari:
The Right Spin: Learning Object Motion from Rotation-Compensated Flow Fields. CoRR abs/2203.00115 (2022)
- [i141] Quentin Le Lidec, Louis Montaut, Cordelia Schmid, Ivan Laptev, Justin Carpentier:
Leveraging Randomized Smoothing for Optimal Control of Nonsmooth Dynamical Systems. CoRR abs/2203.03986 (2022)
- [i140] Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid:
TubeDETR: Spatio-Temporal Video Grounding with Transformers. CoRR abs/2203.16434 (2022)
- [i139] Arsha Nagrani, Paul Hongsuck Seo, Bryan Seybold, Anja Hauth, Santiago Manen, Chen Sun, Cordelia Schmid:
Learning Audio-Video Modalities from Image Captions. CoRR abs/2204.00679 (2022)
- [i138] Thomas Chabal, Robin Strudel, Etienne Arlaud, Jean Ponce, Cordelia Schmid:
Assembly Planning from Observations under Physical Constraints. CoRR abs/2204.09616 (2022)
- [i137] Robin Strudel, Ivan Laptev, Cordelia Schmid:
Weakly-supervised segmentation of referring expressions. CoRR abs/2205.04725 (2022)
- [i136] Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid:
Learning to Answer Visual Questions from Web Videos. CoRR abs/2205.05019 (2022)
- [i135] Valentin Gabeur, Paul Hongsuck Seo, Arsha Nagrani, Chen Sun, Karteek Alahari, Cordelia Schmid:
AVATAR: Unconstrained Audiovisual Speech Recognition. CoRR abs/2206.07684 (2022)
- [i134] Antoine Yang, Antoine Miech, Josef Sivic, Ivan Laptev, Cordelia Schmid:
Zero-Shot Video Question Answering via Frozen Bidirectional Language Models. CoRR abs/2206.08155 (2022)
- [i133] Xuehan Xiong, Anurag Arnab, Arsha Nagrani, Cordelia Schmid:
M&M Mix: A Multimodal Multiview Transformer Ensemble. CoRR abs/2206.09852 (2022)
- [i132]