


CVPR 2024: Seattle, WA, USA - Workshops
- IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Workshops, Seattle, WA, USA, June 17-18, 2024. IEEE 2024, ISBN 979-8-3503-6547-4
- Jaime Spencer, Fabio Tosi, Matteo Poggi, Ripudaman Singh Arora, Chris Russell, Simon Hadfield, Richard Bowden, GuangYuan Zhou, ZhengXin Li, Qiang Rao, YiPing Bao, Xiao Liu, Dohyeong Kim, Jinseong Kim, Myunghyun Kim, Mykola Lavreniuk, Rui Li, Qing Mao, Jiang Wu, Yu Zhu, Jinqiu Sun, Yanning Zhang, Suraj Patni, Aradhye Agarwal, Chetan Arora, Pihai Sun, Kui Jiang, Gang Wu, Jian Liu, Xianming Liu, Junjun Jiang, Xidan Zhang, Jianing Wei, Fangjun Wang, Zhiming Tan, Jiabao Wang, Albert Luginov, Muhammad Shahzad, Seyed Hosseini, Aleksander Trajcevski, James H. Elder:
The Third Monocular Depth Estimation Challenge. 1-14 - Wenhui Chang, Hongming Chen, Xin He, Xiang Chen, Liangduo Shen:
UAV-Rain1k: A Benchmark for Raindrop Removal from UAV Aerial Imagery. 15-22 - Chuheng Wei, Guoyuan Wu, Matthew J. Barth:
Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions. 23-32 - Alexis Guichemerre, Soufiane Belharbi, Tsiry Mayet, Shakeeb Murtaza, Pourya Shamsolmoali, Luke McCaffrey, Eric Granger:
Source-Free Domain Adaptation of Weakly-Supervised Object Localization Models for Histology. 33-43 - Pavan C. Madhusudana, Jing Li, Zeeshan Nadir, Hamid R. Sheikh, Seok-Jun Lee:
Mobile Aware Denoiser Network (MADNet) for Quad Bayer Images. 44-52 - Tak Ming Wong, Julian Moosmann, Berit Zeller-Plumhoff:
VolRAFT: Volumetric Optical Flow Network for Digital Volume Correlation of Synchrotron Radiation-based Micro-CT Images of Bone-Implant Interfaces. 53-62 - Pragyan Banerjee, Pranjal Saxena, Nur M. M. Kalimullah, Amit Shelke, Anowarul Habib:
Damage Detection and Localization by Learning Deep Features of Elastic Waves in Piezoelectric Ceramic Using Point Contact Method. 63-70 - Bashir Kazimi, Karina Ruzaeva, Stefan Sandfeld:
Self-Supervised Learning with Generative Adversarial Networks for Electron Microscopy. 71-81 - Heiko Karus, Friedhelm Schwenker, Michael Munz, Michael Teutsch:
Towards Explainable Visual Vessel Recognition Using Fine-Grained Classification and Image Retrieval. 82-92 - Dasol Choi, Soora Choi, Eunsun Lee, Jinwoo Seo, Dongbin Na:
Towards Efficient Machine Unlearning with Data Augmentation: Guided Loss-Increasing (GLI) to Prevent the Catastrophic Model Utility Drop. 93-102 - Jensen Hwa, Qingyu Zhao, Aditya Lahiri, Adnan Masood, Babak Salimi, Ehsan Adeli:
Enforcing Conditional Independence for Fair Representation Learning and Causal Image Generation. 103-112 - Trung-Hieu Hoang, Mona Zehni, Huy Phan, Duc Minh Vo, Minh N. Do:
Improving the Robustness of 3D Human Pose Estimation: A Benchmark Dataset and Learning from Noisy Input. 113-123 - Dake Chen, Shiduo Li, Yuke Zhang, Chenghao Li, Souvik Kundu, Peter A. Beerel:
DIA: Diffusion based Inverse Network Attack on Collaborative Inference. 124-130 - Sudarshan Regmi, Bibek Panthi, Yifei Ming, Prashnna K. Gyawali, Danail Stoyanov, Binod Bhattarai:
ReweightOOD: Loss Reweighting for Distance-based OOD Detection. 131-141 - Aman Bhatta, Domingo Mery, Haiyu Wu, Joyce Annan, Michael C. King, Kevin W. Bowyer:
Our Deep CNN Face Matchers Have Developed Achromatopsia. 142-152 - Sudarshan Regmi, Bibek Panthi, Sakar Dotel, Prashnna K. Gyawali, Danail Stoyanov, Binod Bhattarai:
T2FNorm: Train-time Feature Normalization for OOD Detection in Image Classification. 153-162 - Cynthia Ifeyinwa Ugwu, Sofia Casarin, Oswald Lanz:
Fractals as Pre-training Datasets for Anomaly Detection and Localization. 163-172 - Akshay Mehra, Yunbei Zhang, Jihun Hamm:
Test-time Assessment of a Model's Performance on Unseen Domains via Optimal Transport. 173-182 - Zheming Zuo, Joseph Smith, Jonathan Stonehouse, Boguslaw Obara:
Robust and Explainable Fine-Grained Visual Classification with Transfer Learning: A Dual-Carriageway Framework. 183-193 - Yifan Shen, Zhengyuan Li, Gang Wang:
Practical Region-level Attack against Segment Anything Models. 194-203 - Faridoun Mehri, Mohsen Fayyaz, Mahdieh Soleymani Baghshah, Mohammad Taher Pilehvar:
SkipPLUS: Skip the First Few Layers to Better Explain Vision Transformers. 204-215 - Achref Doula, Max Mühlhäuser, Alejandro Sánchez Guinea:
AR-CP: Uncertainty-Aware Perception in Adverse Conditions with Conformal Prediction and Augmented Reality For Assisted Driving. 216-226 - Guihong Li, Hsiang Hsu, Chun-Fu Richard Chen, Radu Marculescu:
Fast-NTK: Parameter-Efficient Unlearning for Large-Scale Models. 227-234 - Sander De Coninck, Sam Leroux, Pieter Simoens:
Mitigating Bias Using Model-Agnostic Data Attribution. 235-243 - Sreetama Sarkar, Souvik Kundu, Peter A. Beerel:
RLNet: Robust Linearized Networks for Efficient Private Inference. 244-253 - Gaurav Kumar Nayak, Inder Khatri, Ruchit Rawal, Anirban Chakraborty:
Data-free Defense of Black Box Models Against Adversarial Attacks. 254-263 - Dayvid Castro, Byron Leite Dantas Bezerra, Cleber Zanchettin:
An End-to-End Approach for Handwriting Recognition: From Handwritten Text Lines to Complete Pages. 264-273 - Iván Reyes-Amezcua, Gilberto Ochoa-Ruiz, Andres Mendez-Vazquez:
Enhancing Image Classification Robustness through Adversarial Sampling with Delta Data Augmentation (DDA). 274-283 - Luiz Schirmer, Guilherme G. Schardong, Vinícius da Silva, Rogério Santos, Hélio Lopes:
High-Resolution Detection of Earth Structural Heterogeneities from Seismic Amplitudes using Convolutional Neural Networks with Attention layers. 284-292 - Fabian Perez, Hoover Rueda-Chacon:
Beyond Appearances: Material Segmentation with Embedded Spectral Information from RGB-D imagery. 293-301 - Maria Luísa Lima, Willams de Lima Costa, Estefania Talavera Martínez, Veronica Teichrieb:
ST-Gait++: Leveraging spatio-temporal convolutions for gait-based emotion recognition on videos. 302-310 - Ramon Izquierdo-Cordova, Walterio W. Mayol-Cuevas:
The Myth of the Pyramid. 311-321 - Hao Lu, Xuesong Niu, Jiyao Wang, Yin Wang, Qingyong Hu, Jiaqi Tang, Yuting Zhang, Kaishen Yuan, Bin Huang, Zitong Yu, Dengbo He, Shuiguang Deng, Hao Chen, Yingcong Chen, Shiguang Shan:
GPT as Psychologist? Preliminary Evaluations for GPT-4V on Visual Affective Computing. 322-331 - Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi:
NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry. 332-342 - Felipe Parodi, Jordan K. Matelsky, Alejandra Regla-Vargas, Elizabeth E. Foglia, Charis Lim, Danielle Weinberg, Konrad P. Kording, Heidi M. Herrick, Michael L. Platt:
Vision-language models for decoding provider attention during neonatal resuscitation. 343-353 - Sam Cantrill, David Ahmedt-Aristizabal, Lars Petersson, Hanna Suominen, Mohammad Ali Armin:
Orientation-conditioned Facial Texture Mapping for Video-based Facial Remote Photoplethysmography Estimation. 354-363 - Simon Wegerif, Ivan Veleslavov, Lieke Dorine van Putten, Kate Emily Bamford, Gauri Misra, Niall Mullen:
Paediatric Pulse Rate Measurements: a Comparison of Methods using Remote Photoplethysmography. 364-370 - Zhekang Dong, Chenhao Hu, Shiqi Zhou, Liyan Zhu, Junfan Wang, Yi Chen, Xudong Lv, Xiaoyue Ji:
DECNet: A Non-Contacting Dual-Modality Emotion Classification Network for Driver Health Monitoring. 371-379 - Nathan Vance, Patrick J. Flynn:
Refining Remote Photoplethysmography Architectures using CKA and Empirical Methods. 380-388 - Alexander Vedernikov, Zhaodong Sun, Virpi-Liisa Kykyri, Mikko Pohjola, Miriam Nokia, Xiaobai Li:
Analyzing Participants' Engagement during Online Meetings Using Unsupervised Remote Photoplethysmography with Behavioral Features. 389-399 - Priya Singh, Abhishek Pathak, Umer Jon Ganai, Braj Bhushan, Venkatesh K. Subramanian:
Video Based Computational Coding of Movement Anomalies in ASD Children. 400-409 - Björn Braun, Daniel McDuff, Christian Holz:
How Suboptimal is Training rPPG Models with Videos and Targets from Different Body Sites? 410-418 - Jie Zhao, Zhitong Xiong, Xiao Xiang Zhu:
UrbanSARFloods: Sentinel-1 SLC-Based Benchmark Dataset for Urban and Open-Area Flood Mapping. 419-429 - Xavier Bou, Gabriele Facciolo, Rafael Grompone von Gioi, Jean-Michel Morel, Thibaud Ehret:
Exploring Robust Features for Few-Shot Object Detection in Satellite Imagery. 430-439 - Sarra Khairi, Etienne Meunier, Renaud Fraisse, Patrick Bouthemy:
Efficient local correlation volume for unsupervised optical flow estimation on small moving objects in large satellite images. 440-448 - Yongquan Qu, Juan Nathaniel, Shuolin Li, Pierre Gentine:
Deep Generative Data Assimilation in Multimodal Setting. 449-459 - Srikumar Sastry, Subash Khanal, Aayush Dhakal, Nathan Jacobs:
GeoSynth: Contextually-Aware High-Resolution Satellite Image Synthesis. 460-470 - Patrick Ebel, Brandon Victor, Peter Naylor, Gabriele Meoni, Federico Serva, Rochelle Schneider:
Implicit Assimilation of Sparse In Situ Data for Dense & Global Storm Surge Forecasting. 471-480 - Georges Le Bellier, Nicolas Audebert:
Detecting Out-Of-Distribution Earth Observation Images with Diffusion Models. 481-491 - Alex Stoken, Peter Ilhardt, Mark Lambert, Kenton Fisher:
(Street) Lights Will Guide You: Georeferencing Nighttime Astronaut Photography of Earth. 492-501 - Aimi Okabayashi, Nicolas Audebert, Simon Donike, Charlotte Pelletier:
Cross-sensor super-resolution of irregularly sampled Sentinel-2 time series. 502-511 - Vasudha Venkatesan, Daniel Panangian, Mario Fuentes Reyes, Ksenia Bittner:
SyntStereo2Real: Edge-Aware GAN for Remote Sensing Image-to-Image Translation while Maintaining Stereo Constraint. 512-521 - Nikhil Behari, Akshat Dave, Kushagra Tiwary, William Yang, Ramesh Raskar:
SUNDIAL: 3D Satellite Understanding through Direct, Ambient, and Complex Lighting Decomposition. 522-532 - Aayush Dhakal, Adeel Ahmad, Subash Khanal, Srikumar Sastry, Hannah Kerner, Nathan Jacobs:
Sat2Cap: Mapping Fine-Grained Textual Descriptions from Satellite Images. 533-542 - Clifford Broni-Bediako, Junshi Xia, Naoto Yokoya:
Unsupervised Domain Adaptation Architecture Search with Self-Training for Land Cover Mapping. 543-553 - Jonathan Roberts, Timo Lüddecke, Rehan Sheikh, Kai Han, Samuel Albanie:
Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs. 554-563 - Thibaud Ehret, Roger Marí, Dawa Derksen, Nicolas Gasnier, Gabriele Facciolo:
Radar Fields: An Extension of Radiance Fields to SAR. 564-574 - Ivica Obadic, Alex Levering, Lars Pennig, Dário A. B. Oliveira, Diego Marcos, Xiaoxiang Zhu:
Contrastive Pretraining for Visual Concept Explanations of Socioeconomic Outcomes. 575-584 - Simranjit Singh, Michael Fore, Dimitrios Stamoulis:
GeoLLM-Engine: A Realistic Environment for Building Geospatial Copilots. 585-594 - Rudhishna Narayanan Nair, Ronny Hänsch:
Let me show you how it's done - Cross-modal knowledge distillation as pretext task for semantic segmentation. 595-603 - Swati Jindal, Mohit Yadav, Roberto Manduchi:
Spatio-Temporal Attention and Gaussian Processes for Personalized Video Gaze Estimation. 604-614 - Anshul Gupta, Pierre Vuillecard, Arya Farkhondeh, Jean-Marc Odobez:
Exploring the Zero-Shot Capabilities of Vision-Language Models for Improving Gaze Following. 615-624 - Takumi Nishiyasu, Yoichi Sato:
Gaze Scanpath Transformer: Predicting Visual Search Target by Spatiotemporal Semantic Modeling of Gaze Scanpath. 625-635 - Athul M. Mathew, Arshad Ali Khan, Thariq Khalid, Riad Souissi:
GESCAM : A Dataset and Method on Gaze Estimation for Classroom Attention Measurement. 636-645 - Xin Yue, Zongqing Lu, Xiangru Lin, Wenjia Ren, Zhijing Shao, Haonan Hu, Yu Zhang, Qingmin Liao:
Semi-Stereo: A Universal Stereo Matching Framework for Imperfect Data via Semi-supervised Learning. 646-655 - Runfa Li, Upal Mahbub, Vasudev Bhaskaran, Truong Q. Nguyen:
MonoSelfRecon: Purely Self-Supervised Explicit Generalizable 3D Reconstruction of Indoor Scenes from Monocular RGB Views. 656-666 - Torben Teepe, Philipp Wolters, Johannes Gilg, Fabian Herzog, Gerhard Rigoll:
Lifting Multi-View Detection and Tracking to the Bird's Eye View. 667-676 - Jin Gyu Hong, Seung Young Noh, Hee Kyung Lee, Won-Sik Cheong, Ju Yong Chang:
3D Clothed Human Reconstruction from Sparse Multi-View Images. 677-687 - Jérôme Revaud, Yohann Cabon, Romain Brégier, JongMin Lee, Philippe Weinzaepfel:
SACReg: Scene-Agnostic Coordinate Regression for Visual Localization. 688-698 - Yunhui Zhu, Jiajing Chen, Senem Velipasalar:
DepthVoting: A Few-Shot Point Cloud Classification Model Incorporating a Projection-Based Voting Mechanism. 699-707 - Amaya Dharmasiri, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan:
Cross-Modal Self-Training: Aligning Images and Pointclouds to learn Classification without Labels. 708-717 - Kalyani Marathe, Mahtab Bigverdi, Nishat Khan, Tuhin Kundu, Patrick Howe, Sharan Ranjit S, Anand Bhattad, Aniruddha Kembhavi, Linda G. Shapiro, Ranjay Krishna:
MIMIC: Masked Image Modeling with Image Correspondences. 718-727 - Mona Saleh Alzahrani, Muhammad Usman, Saeed Anwar, Tarek Helmy:
Selective Multi-View Deep Model for 3D Object Classification. 728-736 - Wonseok Oh, Youngjoo Jo:
From 2D Portraits to 3D Realities: Advancing GAN Inversion for Enhanced Image Synthesis. 737-746 - Hovhannes Margaryan, Daniil Hayrapetyan, Wenyan Cong, Zhangyang Wang, Humphrey Shi:
DGBD: Depth Guided Branched Diffusion for Comprehensive Controllability in Multi-View Generation. 747-756 - Mansi Sharma, Rohit Choudhary, Rithvik Anil:
2T-UNET: A Two-Tower UNet with Depth Clues for Robust Stereo Depth Estimation. 757-764 - Guoxian Song:
AgileGAN3D: Few-Shot 3D Portrait Stylization by Augmented Transfer Learning. 765-774 - Sieun Kim, Kyungjin Lee, Youngki Lee:
Color-cued Efficient Densification Method for 3D Gaussian Splatting. 775-783 - Huantao Ren, Jiyang Wang, Minmin Yang, Senem Velipasalar:
PointOfView: A Multi-modal Network for Few-shot 3D Point Cloud Classification Fusing Point and Multi-view Image Features. 784-793 - Dae Yeol Lee, Guan-Ming Su, Peng Yin:
OGRMPI: An Efficient Multiview Integrated Multiplane Image based on Occlusion Guided Residuals. 794-802 - Yik Lung Pang, Changjae Oh, Andrea Cavallaro:
Sparse multi-view hand-object reconstruction for unseen environments. 803-810 - Jaeyoung Chung, Jeongtaek Oh, Kyoung Mu Lee:
Depth-Regularized Optimization for 3D Gaussian Splatting in Few-Shot Images. 811-820 - Thibault Castells, Hyoung-Kyu Song, Bo-Kyeong Kim, Shinkook Choi:
LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights. 821-830 - Min-Hui Lin, Mahesh Reddy, Guillaume Berger, Michel Sarkis, Fatih Porikli, Ning Bi:
EdgeRelight360: Text-Conditioned 360-Degree HDR Image Generation for Real-Time On-Device Video Portrait Relighting. 831-840 - Samuel Cerezo, Javier Civera:
Camera Motion Estimation from RGB-D-Inertial Scene Flow. 841-849 - Sainan Liu, Shan Lin, Jingpei Lu, Alexey Supikov, Michael C. Yip:
BAA-NGP: Bundle-Adjusting Accelerated Neural Graphics Primitives. 850-857 - Amin Abouee, Ashwanth Ravi, Lars Hinneburg, Mateusz Dziwulski, Florian Ölsner, Jürgen Hess, Stefan Milz, Patrick Mäder:
Weakly Supervised End2End Deep Visual Odometry. 858-865 - Francesco Ballerini, Pierluigi Zama Ramirez, Roberto Mirabella, Samuele Salti, Luigi Di Stefano:
Connecting NeRFs, Images, and Text. 866-876 - Theo W. Costain, Kejie Li, Victor Adrian Prisacariu:
Contextualising Implicit Representations for Semantic Tasks. 877-887 - Monsij Biswal, Tong Shao, Kenneth Rose, Peng Yin, Sean McCarthy:
StegaNeRV: Video Steganography using Implicit Neural Representation. 888-898 - Haoan Feng, Xin Xu, Leila De Floriani:
ImplicitTerrain: a Continuous Surface Model for Terrain Data Analysis. 899-909 - Jianbo Wang, Heliang Zheng, Toshihiko Yamasaki:
Reference-based GAN Evaluation by Adaptive Inversion. 910-918 - Haocheng Yuan, Ajian Liu, Junze Zheng, Jun Wan, Jiankang Deng, Sergio Escalera, Hugo Jair Escalante, Isabelle Guyon, Zhen Lei:
Unified Physical-Digital Attack Detection Challenge. 919-929 - Hang Zou, Hui Zhang, Yuan Zhang, Hui Ma, Dexin Zhao, Qi Zhang, Qi Li:
Multi-angle Consistent Generative NeRF with Additive Angular Margin Momentum Contrastive Learning. 930-939 - Michail Tarasiou, Jiankang Deng, Stefanos Zafeiriou:
Rethinking the Domain Gap in Near-infrared Face Recognition. 940-949 - Siying Cui, Jia Guo, Xiang An, Jiankang Deng, Yongle Zhao, Xinyu Wei, Ziyong Feng:
IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models. 950-959 - Jiaruo Yu, Dagong Lu, Xingyue Shi, Chenfan Qu, Fengjun Guo:
Unified Face Attack Detection with Micro Disturbance and a Two-Stage Training Strategy. 960-969 - Hyojin Kim, Jiyoon Lee, Yonghyun Jeong, Haneol Jang, Youngjoon Yoo:
Advancing Cross-Domain Generalizability in Face Anti-Spoofing: Insights, Design, and Metrics. 970-979 - Chuanbiao Song, Yan Hong, Jun Lan, Huijia Zhu, Weiqiang Wang, Jianfu Zhang:
Supervised Contrastive Learning for Snapshot Spectral Imaging Face Anti-Spoofing. 980-985 - Minzhe Huang, Changwei Nie, Weihong Zhong:
A visualization method for data domain changes in CNN networks and the optimization method for selecting thresholds in classification tasks. 986-994 - Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu:
Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues. 995-1004 - Hui Li, Yaowen Xu, Zhaofan Zou, Zhixiang He:
Snapshot Spectral Imaging for Face Anti-Spoofing: Addressing Data Challenges with Advanced Processing and Training. 1005-1012 - Sabari Nathan, M. Parisa Beham, A Nagaraj, S. Mohamed Mansoor Roomi:
Multiattention-Net: A Novel Approach to Face Anti-Spoofing with Modified Squeezed Residual Blocks. 1013-1020 - Luis S. Luevano, Yoanna Martínez-Díaz, Heydi Méndez-Vázquez, Miguel González-Mendoza, Davide Frey:
Assessing the Performance of Efficient Face Anti-Spoofing Detection Against Physical and Digital Presentation Attacks. 1021-1028 - Kota Yamashita, Kazuhiro Hotta:
MixStyle-Based Contrastive Test-Time Adaptation: Pathway to Domain Generalization. 1029-1037 - Xiaoqian Ruan, Wei Tang:
Fully Test-time Adaptation for Object Detection. 1038-1047 - Sam Leroux, Dewant Katare, Aaron Yi Ding, Pieter Simoens:
Test-time Specialization of Dynamic Neural Networks. 1048-1056 - Masud An Nur Islam Fahim, Mohammed Innat, Jani Boutellier:
ST2ST: Self-Supervised Test-time Adaptation for Video Action Recognition. 1057-1066 - Chowdhury Sadman Jahan, Andreas E. Savakis:
Unknown Sample Discovery for Source Free Open Set Domain Adaptation. 1067-1076 - Chengyu Wang, Jing Li, Pavan C. Madhusudanarao, Jinhan Hu, Jitesh K. Singh, WooJhon Choi, Seok-Jun Lee, Hamid R. Sheikh:
UDAC: Under-Display Array Cameras. 1077-1084 - Jiahao Qin, Pinle Qin, Rui Chai, Jia Qin, Zanxia Jin:
EL2NM: Extremely Low-light Noise Modeling Through Diffusion Iteration. 1085-1094 - Yunfan Lu, Yijie Xu, Wenzong Ma, Weiyu Guo, Hui Xiong:
Event Camera Demosaicing via Swin Transformer and Pixel-focus Loss. 1095-1105 - Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun:
From Synthetic to Real: A Calibration-free Pipeline for Few-shot Raw Image Denoising. 1106-1114 - Xuhui Liu, Bohan Zeng, Sicheng Gao, Shanglin Li, Yutang Feng, Hong Li, Boyu Liu, Jianzhuang Liu, Baochang Zhang:
LaDiffGAN: Training GANs with Diffusion Supervision in Latent Spaces. 1115-1125 - Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha:
DemosaicFormer: Coarse-to-Fine Demosaicing Network for HybridEVS Camera. 1126-1135 - Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, Yongyong Chen, Jingyong Su, Xianyu Guan, Hongyuan Yu, Cheng Wan, Jiamin Lin, Binnan Han, Yajun Zou, Zhuoyuan Wu, Yuan Huang, Yongsheng Yu, Daoan Zhang, Jizhe Li, Xuanwu Yin, Kunlong Zuo, Yunfan Lu, Yijie Xu, Wenzong Ma, Weiyu Guo, Hui Xiong, Wei Yu, Bingchun Luo, Sabari Nathan, Priya Kansal:
MIPI 2024 Challenge on Demosaic for Hybridevs Camera: Methods and Results. 1136-1143 - Yuekun Dai, Dafeng Zhang, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Peiqing Yang, Zhezhu Jin, Guanqun Liu, Chen Change Loy:
MIPI 2024 Challenge on Nighttime Flare Removal: Methods and Results. 1144-1152 - Xin Jin, Chunle Guo, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangchen Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Ruoqi Li, Chang Liu, Ziyi Wang, Yao Du, Jingjing Yang, Long Bao, Heng Sun, Xiangyu Kong, Xiaoxia Xing, Jinlong Wu, Yuanyang Xue, Hyunhee Park, Sejun Song, Changho Kim, Jingfan Tan, Wenhan Luo, Zikun Liu, Mingde Qiao, Junjun Jiang, Kui Jiang, Yao Xiao, Chuyang Sun, Jinhui Hu, Weijian Ruan, Yubo Dong, Kai Chen, Hyejeong Jo, Jiahao Qin, Bingjie Han, Pinle Qin, Rui Chai, Pengyuan Wang:
MIPI 2024 Challenge on Few-shot RAW Image Denoising: Methods and Results. 1153-1161 - Tommie Kerssies, Daan de Geus, Gijs Dubbelman:
How to Benchmark Vision Foundation Models for Semantic Segmentation? 1162-1171 - Brunó Bence Englert, Fabrizio J. Piva, Tommie Kerssies, Daan de Geus, Gijs Dubbelman:
Exploring the Benefits of Vision Foundation Models for Unsupervised Domain Adaptation. 1172-1180 - Radu Dondera:
Towards Learning Image Similarity from General Triplet Labels. 1181-1190 - Davide Moltisanti, Hakan Bilen, Laura Sevilla-Lara, Frank Keller:
Coarse or Fine? Recognising Action End States without Labels. 1191-1200 - Oriol Barbany, Michael Huang, Xinliang Zhu, Arnab Dhua:
Leveraging Large Language Models for Multimodal Search. 1201-1210 - Kam Woh Ng, Xiatian Zhu, Yi-Zhe Song, Tao Xiang:
ConceptHash: Interpretable Fine-Grained Hashing via Concept Discovery. 1211-1223 - Tarun Sharma, Danelle E. Cline, Duane Edgington:
Making use of unlabeled data: Comparing strategies for marine animal detection in long-tailed datasets using self-supervised and semi-supervised pre-training. 1224-1233 - William Michael Laprade, Pawel Tomasz Pieta, Svetlana Kutuzova, Jesper Cairo Westergaard, Mads Nielsen, Svend Christensen, Anders Bjorholm Dahl:
HyperLeaf2024 - A Hyperspectral Imaging Dataset for Classification and Regression of Wheat Leaves. 1234-1243 - Tarun Sharma, Julian Morgan Wagner, Sara Beery, William B. Dickson, Michael H. Dickinson, Joseph Parker:
Monitoring Social Insect Activity with Minimal Human Supervision. 1244-1253 - Hannes Reichert, Manuel Hetzel, Andreas Hubert, Konrad Doll, Bernhard Sick:
Sensor Equivariance: A Framework for Semantic Segmentation with Diverse Camera Models. 1254-1261 - Jingguo Liu, Yijun Xu, Shigang Li, Jianfeng Li:
Estimating Depth of Monocular Panoramic Image with Teacher-Student Model Fusing Equirectangular and Spherical Representations. 1262-1271 - Jiajing Chen, Zhiqiang Wan, Manjunath Narayana, Yuguang Li, Will Hutchcroft, Senem Velipasalar, Sing Bing Kang:
BGDNet: Background-guided Indoor Panorama Depth Estimation. 1272-1281 - Cing-Jia Lin, Jheng-Wei Su, Kai-Wen Hsiao, Ting-Yu Yen, Chih-Yuan Yao, Hung-Kuo Chu:
DQ-HorizonNet: Enhancing Door Detection Accuracy in Panoramic Images via Dynamic Quantization. 1282-1289 - Jay Bhanushali, Manivannan Muniyandi, Praneeth Chakravarthula:
Cross-Domain Synthetic-to-Real In-the-Wild Depth and Normal Estimation for 3D Scene Understanding. 1290-1300 - Madhumitha Sakthi, Louis Kerofsky, Varun Ravi Kumar, Senthil Kumar Yogamani:
Impact of Video Compression Artifacts on Fisheye Camera Visual Perception Tasks. 1301-1310 - Uzair Shah, Muhammad Tukur, Mahmood Alzubaidi, Giovanni Pintore, Enrico Gobbetti, Mowafa S. Househ, Jens Schneider, Marco Agus:
MultiPanoWise: holistic deep architecture for multi-task dense prediction from a single panoramic image. 1311-1321 - Yuhao Shan, Heyu Chen, Jiaying Zhang, Shigang Li, Jianfeng Li:
Multi-scale Attention-Based Inclination Angles Estimation for Panoramic Camera. 1322-1330 - Senthil Kumar Yogamani, David Unger, Venkatraman Narayanan, Varun Ravi Kumar:
FisheyeBEVSeg: Surround View Fisheye Cameras based Bird's-Eye View Segmentation for Autonomous Driving. 1331-1334 - Filip Slezak, Morten Stigaard Laursen, Thomas B. Moeslund:
Exploring the Limits: Applying State-of-the-Art Stereo Matching Algorithms to Rectified Ultra-Wide Stereo. 1335-1344 - Haiyang Jiang, Zhihang Zhong, Yinqiang Zheng:
Gain-first or Exposure-first: Benchmark for Better Low-light Video Photography and Enhancement. 1345-1356 - Tianqi Ren, Qiu Shen, Ying Fu, Shaodi You:
Point-Supervised Semantic Segmentation of Natural Scenes via Hyperspectral Imaging. 1357-1367 - Xinyuan Liu, Lingen Li, Lin Zhu, Lizhi Wang:
Computational Spectral Imaging with Unified Encoding Model and Beyond. 1368-1378 - Zhendong Yang, Zhe Li, Ailing Zeng, Zexian Li, Chun Yuan, Yu Li:
ViTKD: Feature-based Knowledge Distillation for Vision Transformers. 1379-1388 - Qi Bi, Shaodi You, Theo Gevers:
Generalized Foggy-Scene Semantic Segmentation by Frequency Decoupling. 1389-1399 - Shi Mao, Chenming Wu, Ran Yi, Zhelun Shen, Liangjun Zhang, Wolfgang Heidrich:
Generating Material-Aware 3D Models from Sparse Views. 1400-1409 - Marius Dufraisse, Marcela Carvalho, Pauline Trouvé-Peloux, Frédéric Champagnat:
Physics Based Camera Privacy: Lens and Network Co-Design to the Rescue. 1410-1419 - Xiwen Chen, Wenhui Zhu, Peijie Qiu, Abolfazl Razi:
Imaging Signal Recovery Using Neural Network Priors Under Uncertain Forward Model Parameters. 1420-1429 - Jiaxi Lv, Yi Huang, Mingfu Yan, Jiancheng Huang, Jianzhuang Liu, Yifan Liu, Yafei Wen, Xiaoxin Chen, Shifeng Chen:
GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning. 1430-1440 - Zhi-Yi Lin, Bofan Lyu, Judith Cueto Fernandez, Eline van der Kruk, Ajay Seth, Xucong Zhang:
3D Kinematics Estimation from Video with a Biomechanical Model and Synthetic Training Data. 1441-1450 - Yuka Ogino, Kazuya Kakizaki, Takahiro Toizumi, Atsushi Ito:
Outsmarting Biometric Imposters: Enhancing Iris-Recognition System Security through Physical Adversarial Example Generation and PAD Fine-Tuning. 1451-1461 - Ya-Chi Liang, Min-Xuan Qiu, Shang-Hong Lai:
FIQA-FAS: Face Image Quality Assessment Based Face Anti-Spoofing. 1462-1470 - Giuseppe Tarollo, Tomaso Fontanini, Claudio Ferrari, Guido Borghi, Andrea Prati:
Adversarial Identity Injection for Semantic Face Image Synthesis. 1471-1480 - Zijian Chen, Mei Wang, Weihong Deng, Hongzhi Shi, Dongchao Wen, Yingjie Zhang, Xingchen Cui, Jian Zhao:
Confidence-Aware RGB-D Face Recognition via Virtual Depth Synthesis. 1481-1489 - Jan Niklas Kolf, Naser Damer, Fadi Boutros:
GraFIQs: Face Image Quality Assessment Using Gradient Magnitudes. 1490-1499 - Nélida Mirabet-Herranz, Chiara Galdi, Jean-Luc Dugelay:
One Embedding to Predict Them All: Visible and Thermal Universal Face Representations for Soft Biometric Estimation via Vision Transformers. 1500-1509 - Haoyu Zhang, Raghavendra Ramachandra, Kiran B. Raja, Christoph Busch:
Generalized Single-Image-Based Morphing Attack Detection Using Deep Representations from Vision Transformer. 1510-1518 - Kagan Öztürk, Haiyu Wu, Kevin W. Bowyer:
Can the accuracy bias by facial hairstyle be reduced through balancing the training data? 1519-1528 - Lázaro Janier González-Soler, Maciej Salwowski, Christian Rathgeb, Daniel Fischer:
TattTRN: Template Reconstruction Network for Tattoo Retrieval. 1529-1538 - Folco Bertini Baldassini, Mustafa Shukor, Matthieu Cord, Laure Soulier, Benjamin Piwowarski:
What Makes Multimodal In-Context Learning Work? 1539-1550 - Hao Chen, Ran Tao, Han Zhang, Yidong Wang, Xiang Li, Wei Ye, Jindong Wang, Guosheng Hu, Marios Savvides:
Conv-Adapter: Exploring Parameter Efficient Transfer Learning for ConvNets. 1551-1561 - Övgü Özdemir, Erdem Akagündüz:
Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts. 1562-1571 - Gahyeon Kim, Sohee Kim, Seokju Lee:
AAPL: Adding Attributes to Prompt Learning for Vision-Language Models. 1572-1582 - Arnav M. Das, Ritwick Chaudhry, Kaustav Kundu, Davide Modolo:
Prompting Foundational Models for Omni-supervised Instance Segmentation. 1583-1592 - Maxime Zanella, Ismail Ben Ayed:
Low-Rank Few-Shot Adaptation of Vision-Language Models. 1593-1603 - Jorge Quesada, Mohammad Alotaibi, Mohit Prabhushankar, Ghassan AlRegib:
PointPrompt: A Multi-modal Prompting Dataset for Segment Anything Model. 1604-1610 - Diganta Misra, Muawiz Chaudhary, Agam Goyal, Bharat Runwal, Pin-Yu Chen:
Uncovering the Hidden Cost of Model Compression. 1611-1621 - Bedirhan Uguz, Ozhan Suat, Batuhan Karagöz, Emre Akbas:
MoCap-to-Visual Domain Adaptation for Efficient Human Mesh Estimation from 2D Keypoints. 1622-1632 - Mara Levy, Abhinav Shrivastava:
V-VIPE: Variational View Invariant Pose Embedding. 1633-1642 - Md Mushfiqur Azam, Kevin Desai:
A Survey on 3D Egocentric Human Pose Estimation. 1643-1654 - Taegun An, Changhee Joo:
CycleGANAS: Differentiable Neural Architecture Search for CycleGAN. 1655-1664 - Konstanty Subbotko, Wojciech Jablonski, Piotr Bilinski:
The devil is in discretization discrepancy. Robustifying Differentiable NAS with Single-Stage Searching Protocol. 1665-1674 - Yi-Cheng Huang, Wei-Hua Li, Chih-Han Tsou, Jun-Cheng Chen, Chu-Song Chen:
UP-NAS: Unified Proxy for Neural Architecture Search. 1675-1684 - Tunhou Zhang, Shiyu Li, Hsin-Pai Cheng, Feng Yan, Hai Li, Yiran Chen:
CSCO: Connectivity Search of Convolutional Operators. 1685-1694 - Sofia Casarin, Oswald Lanz, Sergio Escalera:
GRASP-GCN: Graph-Shape Prioritization for Neural Architecture Search under Distribution Shifts. 1695-1703 - Tianxiao Gao, Li Guo, Shanwei Zhao, Peihan Xu, Yukun Yang, Xionghao Liu, Shihao Wang, Shiai Zhu, Dajiang Zhou:
QuantNAS: Quantization-aware Neural Architecture Search For Efficient Deployment On Mobile Device. 1704-1713 - Arushi Rai, Kyle Buettner, Adriana Kovashka:
Strategies to Leverage Foundational Model Knowledge in Object Affordance Grounding. 1714-1723 - Youcai Zhang, Xinyu Huang, Jinyu Ma, Zhaoyang Li, Zhaochuan Luo, Yanchun Xie, Yuzhuo Qin, Tong Luo, Yaqian Li, Shilong Liu, Yandong Guo, Lei Zhang:
Recognize Anything: A Strong Image Tagging Model. 1724-1732 - Avinash Madasu, Vasudev Lal:
ICSVR: Investigating Compositional and Syntactic Understanding in Video Retrieval Models. 1733-1743 - James Seale Smith, Yen-Chang Hsu, Zsolt Kira, Yilin Shen, Hongxia Jin:
Continual Diffusion with STAMINA: STack-And-Mask INcremental Adapters. 1744-1754 - Gong Zhang, Kai Wang, Xingqian Xu, Zhangyang Wang, Humphrey Shi:
Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models. 1755-1764 - Junchi Wang, Lei Ke:
LLM-Seg: Bridging Image Segmentation and Large Language Model Reasoning. 1765-1774 - Jiachen Li, Jitesh Jain, Humphrey Shi:
Matting Anything. 1775-1785 - Madeline Chantry Schiappa, Shehreen Azad, Sachidanand VS, Yunhao Ge, Ondrej Miksik, Yogesh S. Rawat, Vibhav Vineet:
Robustness Analysis on Foundational Segmentation Models. 1786-1796 - Madeline Schiappa, Raiyaan Abdullah, Shehreen Azad, Jared Claypoole, Michael Cogswell, Ajay Divakaran, Yogesh S. Rawat:
Probing Conceptual Understanding of Large Visual-Language Models. 1797-1807 - Byoungjip Kim, Dasol Hwang, Sungjun Cho, Youngsoo Jang, Honglak Lee, Moontae Lee:
Show, Think, and Tell: Thought-Augmented Fine-Tuning of Large Language Models for Video Captioning. 1808-1817 - Davide Caffagni, Federico Cocchi, Nicholas Moratelli, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara:
Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs. 1818-1826 - Zhenlin Xu, Yi Zhu, Siqi Deng, Abhay Mittal, Yanbei Chen, Manchen Wang, Paolo Favaro, Joseph Tighe, Davide Modolo:
Benchmarking Zero-Shot Recognition with Vision-Language Models: Challenges on Granularity and Specificity. 1827-1836 - Kai Wang, Yapeng Tian, Dimitrios Hatzinakos:
Towards Efficient Audio-Visual Learners via Empowering Pre-trained Vision Transformers with Cross-Modal Adaptation. 1837-1846 - Mengxue Qu, Xiaodong Chen, Wu Liu, Alicia Li, Yao Zhao:
ChatVTG: Video Temporal Grounding via Chat with Video Dialogue Large Language Models. 1847-1856 - Muhammad Nawfal Meeran, Gokul Adethya T, Bhanu Pratyush Mantha:
SAM-PM: Enhancing Video Camouflaged Object Detection using Spatio-Temporal Attention. 1857-1866 - Taeryung Lee, Fabien Baradel, Thomas Lucas, Kyoung Mu Lee, Grégory Rogez:
T2LM: Long-Term 3D Human Motion Generation from Multiple Sentences. 1867-1876 - Uttaran Bhattacharya, Aniket Bera, Dinesh Manocha:
Speech2UnifiedExpressions: Synchronous Synthesis of Co-Speech Affective Face and Body Expressions from Affordable Inputs. 1877-1887 - Jenny Sheng, Matthieu Lin, Andrew Zhao, Kevin Pruvost, Yu-Hui Wen, Yangguang Li, Gao Huang, Yong-Jin Liu:
Exploring Text-to-Motion Generation with Human Preference. 1888-1899 - Baiyi Li, Edmond S. L. Ho, Hubert P. H. Shum, He Wang:
Two-Person Interaction Augmentation with Skeleton Priors. 1900-1910 - Mathis Petrovich, Or Litany, Umar Iqbal, Michael J. Black, Gül Varol, Xue Bin Peng, Davis Rempe:
Multi-Track Timeline Control for Text-Driven 3D Human Motion Generation. 1911-1921 - Steven Hogue, Chenxu Zhang, Hamza Daruger, Yapeng Tian, Xiaohu Guo:
DiffTED: One-shot Audio-driven TED Talk Video Generation with Diffusion-based Co-speech Gestures. 1922-1931 - Léore Bensabath, Mathis Petrovich, Gül Varol:
A Cross-Dataset Study for Text-based 3D Human Motion Retrieval. 1932-1940 - Pablo Ruiz-Ponce, Germán Barquero, Cristina Palmero, Sergio Escalera, José García Rodríguez:
in2IN: Leveraging individual Information to Generate Human INteractions. 1941-1951 - Shivam Mehta, Anna Deichler, Jim O'Regan, Birger Moëll, Jonas Beskow, Gustav Eje Henter, Simon Alexanderson:
Fake it to make it: Using synthetic data to remedy the data shortage in joint multi-modal speech-and-gesture synthesis. 1952-1964 - Ayush Ghadiya, Purbayan Kar, Vishal M. Chudasama, Pankaj Wasnik:
Cross-Modal Fusion and Attention Mechanism for Weakly Supervised Video Anomaly Detection. 1965-1974 - Zaber Ibn Abdul Hakim, Najibul Haque Sarker, Rahul Pratap Singh, Bishmoy Paul, Ali Dabouei, Min Xu:
Leveraging Generative Language Models for Weakly Supervised Sentence Component Analysis in Video-Language Joint Learning. 1975-1985 - Zhizhang Hu, Shasha Li, Ming Du, Arnab Dhua, Douglas Gray:
De-noised Vision-language Fusion Guided by Visual Cues for E-commerce Product Search. 1986-1996 - Jens Piekenbrinck, Alexander Hermans, Narunas Vaskevicius, Timm Linder, Bastian Leibe:
RGB-D Cube R-CNN: 3D Object Detection with Selective Modality Dropout. 1997-2006 - Yang Zhong, Bhiman Kumar Baghel:
Multimodal Understanding of Memes with Fair Explanations. 2007-2017 - Aviral Agrawal, Carlos Mateo Samudio Lezcano, Iqui Balam Heredia-Marin, Prabhdeep Singh Sethi:
Listen Then See: Video Alignment with Speaker Attention. 2018-2027 - Ankan Deria, Komal Kumar, Snehashis Chakraborty, Dwarikanath Mahapatra, Sudipta Roy:
InVERGe: Intelligent Visual Encoder for Bridging Modalities in Report Generation. 2028-2038 - Mengmeng Liu, Hao Cheng, Lin Chen, Hellward Broszio, Jiangtao Li, Runjiang Zhao, Monika Sester, Michael Ying Yang:
LAformer: Trajectory Prediction for Autonomous Driving with Lane-Aware Scene Constraints. 2039-2049 - Tonmoay Deb, Lichen Wang, Zachary Bessinger, Naji Khosravan, Eric Penner, Sing Bing Kang:
ZInD-Tell: Towards Translating Indoor Panoramas into Descriptions. 2050-2059 - Yi-Shan Lee, Wei-Cheng Tseng, Fu-En Wang, Min Sun:
VMCML: Video and Music Matching via Cross-Modality Lifting. 2060-2069 - Niyati Rawal, Roberto Bigazzi, Lorenzo Baraldi, Rita Cucchiara:
AIGeN: An Adversarial Approach for Instruction Generation in VLN. 2070-2080 - Anusha Devulapally, Md Fahim Faysal Khan, Siddharth Advani, Vijaykrishnan Narayanan:
Multi-Modal Fusion of Event and RGB for Monocular Depth Estimation Using a Unified Transformer-based Architecture. 2081-2089 - Yuhan Shen, Linjie Yang, Longyin Wen, Haichao Yu, Ehsan Elhamifar, Heng Wang:
Exploring the Role of Audio in Video Captioning. 2090-2100 - Tse-Wei Chen, Wei Tao, Dongyue Zhao, Kazuhiro Mima, Tadayuki Ito, Kinya Osa, Masami Kato:
Dedicated Inference Engine and Binary-Weight Neural Networks for Lightweight Instance Segmentation. 2101-2110 - Carlos Victorino Padeiro, Tse-Wei Chen, Takahiro Komamizu, Ichiro Ide:
Lightweight Maize Disease Detection through Post-Training Quantization with Similarity Preservation. 2111-2120 - Sam Leroux, Stijn Vanassche, Pieter Simoens:
Multi-bit, Black-box Watermarking of Deep Neural Networks in Embedded Applications. 2121-2130 - Lukas Frickenstein, Pierpaolo Morì, Shambhavi Balamuthu Sampath, Moritz Thoma, Nael Fasfous, Manoj Rohit Vemparala, Alexander Frickenstein, Christian Unger, Claudio Passerone, Walter Stechele:
Pruning as a Binarization Technique. 2131-2140 - Manon Dampfhoffer, Thomas Mesquida:
Neuromorphic Lip-Reading with Signed Spiking Gated Recurrent Units. 2141-2151 - Cevahir Çigla:
Efficient Video Stabilization via Partial Block Phase Correlation on Edge GPUs. 2152-2161 - Jamie Menjay Lin, Jisoo Jeong, Hong Cai, Risheek Garrepalli, Kai Wang, Fatih Porikli:
SciFlow: Empowering Lightweight Optical Flow Models with Self-Cleaning Iterations. 2162-2171 - Francesco Paissan, Davide Nadalini, Manuele Rusci, Alberto Ancilotto, Francesco Conti, Luca Benini, Elisabetta Farella:
Structured Sparse Back-propagation for Lightweight On-Device Continual Learning on Microcontroller Units. 2172-2181 - Luca Bompani, Manuele Rusci, Daniele Palossi, Francesco Conti, Luca Benini:
Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems. 2182-2190 - Raz Ramon, Hadar Cohen Duwek, Elishai Ezra Tsur:
ED-DCFNet: an unsupervised encoder-decoder neural model for event-driven feature extraction and object tracking. 2191-2199 - Anamika Jha, Aratrik Chattopadhyay, Mrinal Banerji, Disha Jain:
RAVN: Reinforcement Aided Adaptive Vector Quantization of Deep Neural Networks. 2200-2209 - Parakh Agarwal, Manu Mathew, Kunal Ranjan Patel, Varun Tripathi, Pramod Swami:
Prune Efficiently by Soft Pruning. 2210-2217 - Omkar Prabhune, Tianen Chen, Younghyun Kim:
Content-aware Input Scaling and Deep Learning Computation Offloading for Low-Latency Embedded Vision. 2218-2226 - Tristan Maidment, Purav J. Patel, Erin Walker, Adriana Kovashka:
Using Language-Aligned Gesture Embeddings for Understanding Gestures Accompanying Math Terms. 2227-2237 - Claudia Cuttano, Gabriele Rosi, Gabriele Trivigno, Giuseppe Averta:
What does CLIP know about peeling a banana? 2238-2247 - Feipeng Ma, Yizhou Zhou, Yueyi Zhang, Siying Wu, Zheyu Zhang, Zilong He, Fengyun Rao, Xiaoyan Sun:
Task Navigator: Decomposing Complex Tasks for Multimodal Large Language Models. 2248-2257 - Anas Zafar, Danyal Aftab, Rizwan Qureshi, Yaofeng Wang, Hong Yan:
Multi-Explainable TemporalNet: An Interpretable Multimodal Approach using Temporal Convolutional Network for User-level Depression Detection. 2258-2265 - Md. Adnan Arefeen, Biplob Debnath, Md. Yusuf Sarwar Uddin, Srimat Chakradhar:
ViTA: An Efficient Video-to-Text Algorithm using VLM for RAG-based Video Analysis System. 2266-2274 - Fiona R. Kolbinger, Jiangpeng He, Jinge Ma, Fengqing Zhu:
Strategies to Improve Real-World Applicability of Laparoscopic Anatomy Segmentation Models. 2275-2284 - Wenhui Zhu, Peijie Qiu, Xiwen Chen, Xin Li, Natasha Leporé, Oana M. Dumitrascu, Yalin Wang:
nnMobileNet: Rethinking CNN for Retinopathy Research. 2285-2294 - Sontje Ihler, Felix Kuhnke, Timo Kuhlgatz, Thomas Seel:
Distribution-Aware Multi-Label FixMatch for Semi-Supervised Learning on CheXpert. 2295-2304 - Isabella Poles, Eleonora D'Arnese, Luca G. Cellamare, Marco D. Santambrogio, Darvin Yi:
Repurposing the Image Generative Potential: Exploiting GANs to Grade Diabetic Retinopathy. 2305-2314 - Abril Corona-Figueroa, Hubert P. H. Shum, Chris G. Willcocks:
Repeat and Concatenate: 2D to 3D Image Translation with 3D to 3D Generative Modeling. 2315-2324 - Vanshali Sharma, Abhishek Kumar, Debesh Jha, Manas Kamal Bhuyan, Pradip K. Das, Ulas Bagci:
ControlPolypNet: Towards Controlled Colon Polyp Synthesis for Improved Polyp Segmentation. 2325-2334 - Sojung Go, Younghoon Ji, Sang Jun Park, Soochahn Lee:
Generation of Structurally Realistic Retinal Fundus Images with Diffusion Models. 2335-2344 - Yumnah Hasan, Talhat Khan, Darian Reyes Fernández de Bulnes, Juan F. H. Albarracín, Conor Ryan:
A Comparative Analysis of Implicit Augmentation Techniques for Breast Cancer Diagnosis Using Multiple Views. 2345-2354 - Jonas Hein, Frédéric Giraud, Lilian Calvet, Alexander Schwarz, Nicola Alessandro Cavalcanti, Sergey Prokudin, Mazda Farshad, Siyu Tang, Marc Pollefeys, Fabio Carrillo, Philipp Fürnstahl:
Creating a Digital Twin of Spinal Surgery: A Proof of Concept. 2355-2364 - Ekaterina Redekop, Mara Pleasure, Zichen Wang, Karthik V. Sarma, Adam Kinnaird, William Speier, Corey W. Arnold:
Codebook VQ-VAE Approach for Prostate Cancer Diagnosis using Multiparametric MRI. 2365-2372 - Divya D. Reddy, Niloufar Saadat, James M. Holcomb, Benjamin C. Wagner, Nghi C. Truong, Jason Bowerman, Kimmo J. Hatanpaa, Toral R. Patel, Marco C. Pinho, Ananth J. Madhuranthakam, Chandan Ganesh Bangalore Yogananda, Joseph A. Maldjian:
Advancing Brain Tumor Analysis: Curating a High-Quality MRI Dataset for Deep Learning-Based Molecular Marker Profiling. 2373-2379 - Adway U. Kanhere, Pranav Kulkarni, Paul H. Yi, Vishwa S. Parekh:
Privacy-Preserving Collaboration for Multi-Organ Segmentation via Federated Learning from Sites with Partial Labels. 2380-2387 - Roger D. Soberanis-Mukul, Jiahuan Cheng, Jan Emily Mangulabnan, S. Swaroop Vedula, Masaru Ishii, Gregory D. Hager, Russell H. Taylor, Mathias Unberath:
GSAM+Cutie: Text-Promptable Tool Mask Annotation for Endoscopic Video. 2388-2394 - Tiago Mota, Maria Rita Verdelho, Diogo J. Araújo, Alceu Bissoto, Carlos Santiago, Catarina Barata:
MMIST-ccRCC: A Real World Medical Dataset for the Development of Multi-Modal Systems. 2395-2403 - Sophie Fischer, Irina Voiculescu:
Hairy Ground Truth Enhancement for Semantic Segmentation. 2404-2412 - François Lecomte, Pablo Alvarez, Stéphane Cotin, Jean-Louis Dillenseger:
Beyond respiratory models: a physics-enhanced synthetic data generation method for 2D-3D deformable registration. 2413-2421 - Florian Ramakers, Tom Vercauteren, Jan Deprest, Helena Williams:
UltraAugment: Fan-shape and Artifact-based Data Augmentation for 2D Ultrasound Images. 2422-2431 - Gemma Canet Tarres, Dan Ruta, Tu Bui, John P. Collomosse:
PARASOL: Parametric Style Control for Diffusion Image Synthesis. 2432-2442 - Xinye Wanyan, Sachith Seneviratne, Shuchang Shen, Michael Kirley:
Extending global-local view alignment for self-supervised learning with remote sensing imagery. 2443-2453 - Mehwish Mehmood, Majed Alsharari, Shahzaib Iqbal, Ivor T. A. Spence, Muhammad Fahim:
RetinaLiteNet: A Lightweight Transformer based CNN for Retinal Feature Segmentation. 2454-2463 - Taiba Majid Wani, Reeva Gulzar, Irene Amerini:
ABC-CapsNet: Attention based Cascaded Capsule Network for Audio Deepfake Detection. 2464-2472 - Mallika Garg, Debashis Ghosh, Pyari Mohan Pradhan:
GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition. 2473-2483 - Yingchao Huang, Abdul Bais:
Unsupervised Domain Adaptation for Weed Segmentation Using Greedy Pseudo-labelling. 2484-2494 - Anant Khandelwal:
RePoseDM: Recurrent Pose Alignment and Gradient Guidance for Pose Guided Image Synthesis. 2495-2504 - Krishnakant Singh, Thanush Navaratnam, Jannik Holmer, Simone Schaub-Meyer, Stefan Roth:
Is Synthetic Data all We Need? Benchmarking the Robustness of Models Trained with Synthetic Images. 2505-2515 - Anant Khandelwal:
FloCoDe: Unbiased Dynamic Scene Graph Generation with Temporal Consistency and Correlation Debiasing. 2516-2526 - Jose M. Rojas Chaves, Subarna Tripathi:
VideoSAGE: Video Summarization with Graph Representation Learning. 2527-2534 - Chaoyi Zhang, Xitong Yang, Ji Hou, Kris Kitani, Weidong Cai, Fu-Jen Chu:
EgoSG: Learning 3D Scene Graphs from Egocentric RGB-D Sequences. 2535-2545 - Ming Cheng, Ziyi Zhou, Bowen Zhang, Ziyu Wang, Jiaqi Gan, Ziang Ren, Weiqi Feng, Yi Lyu, Hefan Zhang, Xingjian Diao:
Efflex: Efficient and Flexible Pipeline for Spatio-Temporal Trajectory Graph Modeling and Representation Learning. 2546-2555 - Congrui Hetang, Haoru Xue, Cindy X. Le, Tianwei Yue, Wenping Wang, Yihui He:
Segment Anything Model for Road Network Graph Extraction. 2556-2566 - Julian Lorenz, Robin Schön, Katja Ludwig, Rainer Lienhart:
A Review and Efficient Implementation of Scene Graph Generation Metrics. 2567-2575 - Abdelhak Lemkhenter, Manchen Wang, Luca Zancato, Gurumurthy Swaminathan, Paolo Favaro, Davide Modolo:
SemiGPC: Distribution-Aware Label Refinement for Imbalanced Semi-Supervised Learning Using Gaussian Processes. 2576-2585 - Karim Guirguis, George Eskandar, Mingyang Wang, Matthias Kayser, Eduardo Monari, Bin Yang, Jürgen Beyerer:
Uncertainty-based Forgetting Mitigation for Generalized Few-Shot Object Detection. 2586-2595 - Giacomo Nebbia, Adriana Kovashka:
Image-caption difficulty for efficient weakly-supervised object detection from in-the-wild data. 2596-2605 - Qiangqiang Wu, Antoni B. Chan:
Learning Tracking Representations from Single Point Annotations. 2606-2615 - Sai Bhargav Rongali, Sarthak Mehrotra, Ankit Jha, Mohamad Hassan N C, Shirsha Bose, Tanisha Gupta, Mainak Singha, Biplab Banerjee:
CDAD-Net: Bridging Domain Gaps in Generalized Category Discovery. 2616-2626 - David Kurzendörfer, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata:
Audio-Visual Generalized Zero-Shot Learning using Pre-Trained Large Multi-Modal Models. 2627-2638 - Pengxiao Han, Changkun Ye, Jieming Zhou, Jing Zhang, Jie Hong, Xuesong Li:
Latent-based Diffusion Model for Long-tailed Recognition. 2639-2648 - Fei Pan, Xu Yin, Seokju Lee, Axi Niu, Sung-Eui Yoon, In So Kweon:
MoDA: Leveraging Motion Priors from Videos for Advancing Unsupervised Domain Adaptation in Semantic Segmentation. 2649-2658 - Tarun Ram Menta, Surgan Jandial, Akash Patil, Saketh Bachu, Vimal K. B., Balaji Krishnamurthy, Vineeth N. Balasubramanian, Mausoom Sarkar, Chirag Agarwal:
Active Transferability Estimation. 2659-2670 - Shuaiyi Huang, De-An Huang, Zhiding Yu, Shiyi Lan, Subhashree Radhakrishnan, José M. Álvarez, Abhinav Shrivastava, Anima Anandkumar:
What is Point Supervision Worth in Video Instance Segmentation? 2671-2681 - Shuaiyi Huang, Saksham Suri, Kamal Gupta, Sai Saketh Rambhatla, Ser-Nam Lim, Abhinav Shrivastava:
UVIS: Unsupervised Video Instance Segmentation. 2682-2692 - Tarun Kalluri, Weiyao Wang, Heng Wang, Manmohan Chandraker, Lorenzo Torresani, Du Tran:
Open-world Instance Segmentation: Top-down Learning with Bottom-up Supervision. 2693-2703 - Xin Hu, Kai Li, Deep Patel, Erik Kruus, Martin Renqiang Min, Zhengming Ding:
Weakly-Supervised Temporal Action Localization with Multi-Modal Plateau Transformers. 2704-2713 - Adele Myers, Nina Miolane:
On Accuracy and Speed of Geodesic Regression: Do Geometric Priors Improve Learning on Small Datasets? 2714-2722 - Scarlett Raine, Ross Marchant, Brano Kusy, Frédéric Maire, Niko Sünderhauf, Tobias Fischer:
Human-in-the-Loop Segmentation of Multi-species Coral Imagery. 2723-2732 - Yuxiang Huang, Yuhao Chen, John S. Zelek:
Zero-Shot Monocular Motion Segmentation in the Wild by Combining Deep Learning with Geometric Motion Model Fusion. 2733-2743 - Zhuohong Li, Fangxiao Lu, Jiaqi Zou, Lei Hu, Hongyan Zhang:
Generalized Few-Shot Meets Remote Sensing: Discovering Novel Classes in Land Cover Mapping via Hybrid Semantic Segmentation Framework. 2744-2754 - Steve Andreas Immanuel, Hagai Raja Sinulingga:
Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain. 2755-2761 - Shihong Wang, Ruixun Liu, Kaiyu Li, Jiawei Jiang, Xiangyong Cao:
Class Similarity Transition: Decoupling Class Similarities and Imbalance from Generalized Few-shot Segmentation. 2762-2770 - Tianyi Gao, Wei Ao, Xing-ao Wang, Yuanhao Zhao, Ping Ma, Mengjie Xie, Hang Fu, Jinchang Ren, Zhi Gao:
Enrich, Distill and Fuse: Generalized Few-Shot Semantic Segmentation in Remote Sensing Leveraging Foundation Model's Assistance. 2771-2780 - Jintao Tong, Haichen Zhou, Yicong Liu, Yiman Hu, Yixiong Zou:
Dynamic Knowledge Adapter with Probabilistic Calibration for Generalized Few-Shot Semantic Segmentation. 2781-2790 - Dharmendra Selvaratnam, Dena Bazazian:
Localised-NeRF: Specular Highlights and Colour Gradient Localising in NeRF. 2791-2801 - Ruiyang Chen, Mohan Yin, Jiawei Shen, Wei Ma:
Recon3D: High Quality 3D Reconstruction from a Single Image Using Generated Back-View Explicit Priors. 2802-2811 - Arnab Dey, Di Yang, Rohith Agaram, Antitza Dantcheva, Andrew I. Comport, Srinath Sridhar, Jean Martinet:
GHNeRF: Learning Generalizable Human Features with Efficient Neural Radiance Fields. 2812-2821 - Lukas Radl, Andreas Kurz, Michael Steiner, Markus Steinberger:
Analyzing the Internals of Neural Radiance Fields. 2822-2831 - Georgios Kouros, Minye Wu, Sushruth Nagesh, Xianling Zhang, Tinne Tuytelaars:
Unveiling the Ambiguity in Neural Inverse Rendering: A Parameter Compensation Analysis. 2832-2841 - Pou-Chun Kung, Seth Isaacson, Ram Vasudevan, Katherine A. Skinner:
SAD-GS: Shape-aligned Depth-supervised Gaussian Splatting. 2842-2851 - Rahul Dey, Bernhard Egger, Vishnu Naresh Boddeti, Ye Wang, Tim K. Marks:
CoLa-SDF: Controllable Latent StyleSDF for Disentangled 3D Face Generation. 2852-2861 - Vincent Cartillier, Grant Schindler, Irfan Essa:
SLAIM: Robust Dense Neural SLAM for Online Tracking and Mapping. 2862-2871 - Wenyan Cong, Hanxue Liang, Zhiwen Fan, Peihao Wang, Yifan Jiang, Dejia Xu, A. Cengiz Öztireli, Zhangyang Wang:
NeRF as Pretraining at Scale: Generalizable 3D-Aware Semantic Representation Learning from View Prediction. 2872-2882 - Dylan Campbell, Eldar Insafutdinov, João F. Henriques, Andrea Vedaldi:
Neural Fields for Co-Reconstructing 3D Objects from Incidental 2D Data. 2883-2893 - Yuwei Chen, Shiyong Chu:
Large Language Models in Wargaming: Methodology, Application, and Robustness. 2894-2903 - Hung-Jui Wang, Yu-Yu Wu, Shang-Tse Chen:
Enhancing Targeted Attack Transferability via Diversified Weight Pruning. 2904-2914 - Xinwei Zhang, Tianyuan Zhang, Yitong Zhang, Shuangcheng Liu:
Enhancing the Transferability of Adversarial Attacks with Stealth Preservation. 2915-2925 - Chen Wang, Angtian Wang, Junbo Li, Alan L. Yuille, Cihang Xie:
Benchmarking Robustness in Neural Radiance Fields. 2926-2936 - Muchao Ye, Xiang Xu, Qin Zhang, Jonathan Wu:
Sharpness-Aware Optimization for Real-World Adversarial Attacks for Diverse Compute Platforms with Enhanced Transferability. 2937-2946 - Krzysztof Jankowski, Bartlomiej Sobieski, Mateusz Kwiatkowski, Jakub Szulc, Michal Janik, Hubert Baniecki, Przemyslaw Biecek:
Red-Teaming Segment Anything Model. 2947-2956 - SangHwa Hong:
Learning to Schedule Resistant to Adversarial Attacks in Diffusion Probabilistic Models Under the Threat of Lipschitz Singularities. 2957-2966 - Furkan Mumcu, Yasin Yilmaz:
Multimodal Attack Detection for Action Recognition Models. 2967-2976 - Femina Senjaliya, Melissa Cote, Amanda Dash, Alexandra Branzan Albu, Andrea Niemi, Stéphane Gauthier, Julek Chawarski, Steve Pearce, Kaan Ersahin, Keath Borg:
Deep Learning-Based Identification of Arctic Ocean Boundaries and Near-Surface Phenomena in Underwater Echograms. 2977-2986 - Maksim Kukushkin, Martin Bogdan, Thomas Schmid:
BiMAE - A Bimodal Masked Autoencoder Architecture for Single-Label Hyperspectral Image Classification. 2987-2996 - Afnan Althoupety, Li-Yun Wang, Wu-Chi Feng, Banafsheh Rekabdar:
DaFF: Dual Attentive Feature Fusion for Multispectral Pedestrian Detection. 2997-3006 - Amogh Joshi, Nikhil Akalwadi, Chinmayee Mandi, Chaitra Desai, Ramesh Ashok Tabib, Ujwala Patil, Uma Mudenagudi:
HNN: Hierarchical Noise-Deinterlace Net Towards Image Denoising. 3007-3016 - Zhuocheng Jiang, Yangmin Ding, Junhui Zhao, Yue Tian, Shaobo Han, Sarper Ozharar, Ting Wang, James M. Moore:
Seeing the Vibration from Fiber-Optic Cables: Rain Intensity Monitoring using Deep Frequency Filtering. 3017-3026 - Cyprien Arnold, Philippe Jouvet, Lama Seoud:
SwinFuSR: an image fusion-inspired model for RGB-guided thermal image super-resolution. 3027-3036 - Kevin Helvig, Baptiste Abeloos, Pauline Trouvé-Peloux:
CAFF-DINO: Multi-spectral object detection transformers with cross-attention features fusion. 3037-3046 - Anja Sheppard, Jason Brown, Nilton O. Renno, Katherine A. Skinner:
Learning Surface Terrain Classifications from Ground Penetrating Radar. 3047-3055 - Weilong Guo, Shengyang Li, Jian Yang:
Scattering Prompt Tuning: A Fine-tuned Foundation Model for SAR Object Recognition. 3056-3065 - Jun Yu, Keda Lu, Shenshen Du, Lin Xu, Peng Chang, Houde Liu, Bin Lan, Tianyu Liu:
MvAV-pix2pixHD: Multi-view Aerial View Image Translation. 3066-3075 - Hongcheng Jiang, ZhiQiang Chen:
Flexible Window-based Self-attention Transformer in Thermal Image Super-Resolution. 3076-3085 - Raghunath Sai Puttagunta, Birendra Kathariya, Zhu Li, George York:
Multi-Scale Feature Fusion using Channel Transformers for Guided Thermal Image Super Resolution. 3086-3095 - Spencer Low, Oliver Nina, Dylan Bowald, Angel Domingo Sappa, Nathan Inkawhich, Peter Bruns:
Multi-modal Aerial View Image Challenge: Sensor Domain Translation. 3096-3104 - Spencer Low, Oliver Nina, Dylan Bowald, Angel Domingo Sappa, Nathan Inkawhich, Peter Bruns:
Multi-modal Aerial View Image Challenge: SAR Classification. 3105-3112 - Rafael E. Rivadeneira, Angel Domingo Sappa, Chenyang Wang, Junjun Jiang, Zhiwei Zhong, Peilin Chen, Shiqi Wang:
Thermal Image Super-Resolution Challenge Results - PBVS 2024. 3113-3122 - Carlos Cortés-Mendez, Jean-Bernard Hayet:
Exploring the usage of diffusion models for thermal image super-resolution: a generic, uncertainty-aware approach for guided and non-guided schemes. 3123-3130 - Christian Mayr, Christian Kübler, Norbert Haala, Michael Teutsch:
Narrowing the Synthetic-to-Real Gap for Thermal Infrared Semantic Image Segmentation Using Diffusion-based Conditional Image Synthesis. 3131-3141 - Yona Falinie A. Gaus, Neelanjan Bhowmik, Brian K. S. Isaac-Medina, Toby P. Breckon:
Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery. 3142-3152 - Abel A. Reyes Angulo, Sidike Paheding:
Forward-Forward Algorithm for Hyperspectral Image Classification. 3153-3161 - Isaac Corley, Caleb Robinson, Rahul Dodhia, Juan M. Lavista Ferres, Peyman Najafirad:
Revisiting pre-trained remote sensing model benchmarks: resizing and normalization matters. 3162-3172 - Ivan DeAndres-Tame, Ruben Tolosana, Pietro Melzi, Rubén Vera-Rodríguez, Minchul Kim, Christian Rathgeb, Xiaoming Liu, Aythami Morales, Julian Fiérrez, Javier Ortega-Garcia, Zhizhou Zhong, Yuge Huang, Yuxi Mi, Shouhong Ding, Shuigeng Zhou, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Zhihong Xiao, Evgeny Smirnov, Anton Pimenov, Aleksei Grigorev, Denis Timoshenko, Kaleb Mesfin Asfaw, Cheng-Yaw Low, Hao Liu, Chuyi Wang, Qing Zuo, Zhixiang He, Hatef Otroshi-Shahreza, Anjith George, Alexander Unnervik, Parsa Rahimi, Sébastien Marcel, Pedro C. Neto, Marco Huber, Jan Niklas Kolf, Naser Damer, Fadi Boutros, Jaime S. Cardoso, Ana Filipa Sequeira, Andrea Atzori, Gianni Fenu, Mirko Marras, Vitomir Struc, Jiang Yu, Zhangjie Li, Jichun Li, Weisong Zhao, Zhen Lei, Xiangyu Zhu, Xiaoyu Zhang, Bernardo Biesseck, Pedro Vidal, Luiz Coelho, Roger Granada, David Menotti:
Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data. 3173-3183 - Jianwei Li, Jun Xue, Rui Cao, Xiaoxia Du, Siyu Mo, Kehao Ran, Zeyan Zhang:
FineRehab: A Multi-modality and Multi-task Dataset for Rehabilitation Analysis. 3184-3193 - Takeshi Kaneko, Rei Kawakami, Takeshi Naemura, Nakamasa Inoue:
Augmenting Pass Prediction via Imitation Learning in Soccer Simulations. 3194-3203 - Lauren Okamoto, Paritosh Parmar:
Hierarchical NeuroSymbolic Approach for Comprehensive and Explainable Action Quality Assessment. 3204-3213 - Calvin Yeung, Kenjiro Ide, Keisuke Fujii:
AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements. 3214-3224 - Farzaneh Askari, Cyril Yared, Rohit Ramaprasad, Devin Garg, Anjun Hu, James J. Clark:
Video Interaction Recognition using an Attention Augmented Relational Network and Skeleton Data. 3225-3234 - Maria Koshkina, James H. Elder:
A General Framework for Jersey Number Recognition in Sports Video. 3235-3244 - Fahad Majeed, Nauman Ullah Gilal, Khaled A. Al-Thelaya, Yin Yang, Marco Agus, Jens Schneider:
MV-Soccer: Motion-Vector Augmented Instance Segmentation for Soccer Player Tracking. 3245-3255 - Naoki Nonaka, Ryo Fujihira, Toshiki Koshiba, Akira Maeda, Jun Seita:
Rugby Scene Classification Enhanced by Vision Language Model. 3256-3266 - Jan Held, Hani Itani, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck:
X-VARS: Introducing Explainability in Football Refereeing with Multi-Modal Large Language Models. 3267-3279 - Arnaud Leduc, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck:
SoccerNet-Depth: a Scalable Dataset for Monocular Depth Estimation in Sports Videos. 3280-3292 - Vladimir Somers, Victor Joos, Anthony Cioppa, Silvio Giancola, Seyed Abolfazl Ghasemzadeh, Floriane Magera, Baptiste Standaert, Amir M. Mansourian, Xin Zhou, Shohreh Kasaei, Bernard Ghanem, Alexandre Alahi, Marc Van Droogenbroeck, Christophe De Vleeschouwer:
SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap. 3293-3305 - Robbe Decorte, Martin Paré, Jelle Vanhaeverbeke, Joachim Taelman, Maarten Slembrouck, Steven Verstockt:
Multi-Modal Hit Detection and Positional Analysis in Padel Competitions. 3306-3314 - Tomohiro Suzuki, Ryota Tanaka, Kazuya Takeda, Keisuke Fujii:
Pseudo-label based unsupervised fine-tuning of a monocular 3D pose estimation model for sports motions. 3315-3324 - Marc Gutiérrez-Pérez, Antonio Agudo:
No Bells, Just Whistles: Sports Field Registration by Leveraging Geometric Properties. 3325-3334 - Floriane Magera, Thomas Hoyoux, Olivier Barnich, Marc Van Droogenbroeck:
A Universal Protocol to Benchmark Camera Calibration for Sports. 3335-3346 - Thomas Gossard, Julian Krismer, Andreas Ziegler, Jonas Tebbe, Andreas Zell:
Table tennis ball spin estimation with an event camera. 3347-3356 - Atom Scott, Ikuma Uchida, Ning Ding, Rikuhei Umemoto, Rory P. Bunker, Ren Kobayashi, Takeshi Koyama, Masaki Onishi, Yoshinari Kameda, Keisuke Fujii:
TeamTrack: A Dataset for Multi-Sport Multi-Object Tracking in Full-pitch Videos. 3357-3366 - Takuya Nakabayashi, Kyota Higa, Masahiro Yamaguchi, Ryo Fujiwara, Hideo Saito:
Event-based Ball Spin Estimation in Sports. 3367-3375 - Magnus Ibh, Stella Graßhof, Dan Witzner Hansen:
A stroke of genius: Predicting the next move in badminton. 3376-3385 - Bruno Cabado, Anthony Cioppa, Silvio Giancola, Andrés Villa, Bertha Guijarro-Berdiñas, Emilio J. Padrón, Bernard Ghanem, Marc Van Droogenbroeck:
Beyond the Premier: Assessing Action Spotting Transfer Capability Across Diverse Domains. 3386-3398 - Altaf Hussain, Noman Khan, Muhammad Munsif, Min Je Kim, Sung Wook Baik:
Medium Scale Benchmark for Cricket Excited Actions Understanding. 3399-3409 - Artur Xarles, Sergio Escalera, Thomas B. Moeslund, Albert Clapés:
T-DEED: Temporal-Discriminability Enhancer Encoder-Decoder for Precise Event Spotting in Sports Videos. 3410-3419 - Jerrin Bright, Bavesh Balaji, Yuhao Chen, David A. Clausi, John S. Zelek:
PitcherNet: Powering the Moneyball Evolution in Baseball Video Analytics. 3420-3429 - Ahmed Qazi, Asim Iqbal:
ExerAIde: AI-assisted Multimodal Diagnosis for Enhanced Sports Performance and Personalised Rehabilitation. 3430-3438 - Hasan Abed Al Kader Hammoud, Shuming Liu, Mohammed Alkhrashi, Fahad Albalawi, Bernard Ghanem:
Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition. 3439-3450 - Nicola Franco, Jeanette Miriam Lorenz, Karsten Roscher, Stephan Günnemann:
Understanding ReLU Network Robustness Through Test Set Certification Performance. 3451-3460 - Marion Neumeier, Sebastian Dorn, Michael Botsch, Wolfgang Utschick:
Reliable Trajectory Prediction and Uncertainty Quantification with Conditioned Diffusion Models. 3461-3470 - Ziliang Xiong, Arvi Jonnarth, Abdelrahman Eldesokey, Joakim Johnander, Bastian Wandt, Per-Erik Forssén:
Hinge-Wasserstein: Estimating Multimodal Aleatoric Uncertainty in Regression Tasks. 3471-3480 - Jing Li, Zigan Wang, Jinliang Li:
AdvDenoise: Fast Generation Framework of Universal and Robust Adversarial Patches Using Denoise. 3481-3490 - Maximilian Dreyer, Reduan Achtibat, Wojciech Samek, Sebastian Lapuschkin:
Understanding the (Extra-)Ordinary: Validating Deep Model Decisions with Prototypical Concept-based Explanations. 3491-3501 - Syed Sha Qutub, Michael Paulitsch, Kay-Ulrich Scholl, Neslihan Köse Cihangir, Korbinian Hagn, Fabian Oboril, Gereon Hinz, Alois Knoll:
Situation Monitor: Diversity-Driven Zero-Shot Out-of-Distribution Detection using Budding Ensemble Architecture for Object Detection. 3502-3511 - Paul Melki, Lionel Bombrun, Boubacar Diallo, Jérôme Dias, Jean-Pierre Da Costa:
The Penalized Inverse Probability Measure for Conformal Classification. 3512-3521 - Hakan Yekta Yatbaz, Mehrdad Dianati, Konstantinos Koufos, Roger Woodman:
Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns. 3522-3531 - Dilyara Bareeva, Maximilian Dreyer, Frederik Pahde, Wojciech Samek, Sebastian Lapuschkin:
Reactive Model Correction: Mitigating Harm to Task-Relevant Features via Conditional Bias Suppression. 3532-3541 - Pallavi Mitra, Gesina Schwalbe, Nadja Klein:
Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study. 3542-3552 - Jingxing Zhou, Chongzhe Zhang, Jürgen Beyerer:
Towards Weakly-Supervised Domain Adaptation for Lane Detection. 3553-3563 - Lena Heidemann, Iwo Kurzidem, Maureen Monnet, Karsten Roscher, Stephan Günnemann:
Towards Engineered Safe AI with Modular Concept Models. 3564-3573 - Luca Mossina, Joseba Dalmau, Léo Andéol:
Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty. 3574-3584 - Daniel DeAlcala, Gonzalo Mancera, Aythami Morales, Julian Fiérrez, Ruben Tolosana, Javier Ortega-Garcia:
A Comprehensive Analysis of Factors Impacting Membership Inference. 3585-3593 - Sujan Sai Gannamaneni, Frederic Klein, Michael Mock, Maram Akila:
Exploiting CLIP Self-Consistency to Automate Image Augmentation for Safety Critical Scenarios. 3594-3604 - James Seale Smith, Lazar Valkov, Shaunak Halbe, Vyshnavi Gutta, Rogério Feris, Zsolt Kira, Leonid Karlinsky:
Adaptive Memory Replay for Continual Learning. 3605-3615 - Robin Schön, Julian Lorenz, Katja Ludwig, Rainer Lienhart:
Adapting the Segment Anything Model During Usage in Novel Situations. 3616-3626 - Shiyao Li, Wenming Yang, Qingmin Liao:
PMAFusion: Projection-Based Multi-Modal Alignment for 3D Semantic Occupancy Prediction. 3627-3634 - Haoxiang Wang, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Mehrdad Farajtabar, Sachin Mehta, Mohammad Rastegari, Oncel Tuzel, Hadi Pouransari:
SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding. 3635-3647 - Piotr Kluska, Adrián Castelló, Florian Scheidegger, A. Cristiano I. Malossi, Enrique S. Quintana-Ortí:
QAttn: Efficient GPU Kernels for mixed-precision Vision Transformers. 3648-3657 - Xin Yuan, Hongliang Fei, Jinoo Baek:
Efficient Transformer Adaptation with Soft Token Merging. 3658-3668 - Onur Can Koyun, Behçet Ugur Töreyin:
HaLViT: Half of the Weights are Enough. 3669-3678 - Reza Akbarian Bafghi, Nidhin Harilal, Claire Monteleoni, Maziar Raissi:
Parameter Efficient Fine-tuning of Self-supervised ViTs without Catastrophic Forgetting. 3679-3684 - Yuning Huang, M. A Hassan, Jiangpeng He, Janine A. Higgins, Megan A. McCrory, Heather A. Eicher-Miller, J. Graham Thomas, Edward Sazonov, Fengqing Zhu:
Automatic Recognition of Food Ingestion Environment from the AIM-2 Wearable Sensor. 3685-3694 - Justin Yang, Zhihao Duan, Jiangpeng He, Fengqing Zhu:
Learning to Classify New Foods Incrementally Via Compressed Exemplars. 3695-3704 - Ke-Lei Wang, Pin-Hsuan Chou, Young-Ching Chou, Chia-Jen Liu, Cheng-Kuan Lin, Yu-Chee Tseng:
MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images. 3705-3714 - Saeed S. Alahmari, Michael Gardner, Tawfiq Salem:
Segment Anything in Food Images. 3715-3720 - Guangzong Chen, Zhi-Hong Mao, Mingui Sun, Kangni Liu, Wenyan Jia:
Shape-Preserving Generation of Food Images for Automatic Dietary Assessment. 3721-3731 - Philip Wootaek Shin, Ajay Narayanan Sridhar, Jack Sampson, Vijaykrishnan Narayanan:
A Generative Exploration of Cuisine Transfer. 3732-3740 - Gautham Vinod, Jiangpeng He, Zeman Shao, Fengqing Zhu:
Food Portion Estimation via 3D Object Scaling. 3741-3749 - Jesús M. Rodríguez-de-Vera, Imanol G. Estepa, Marc Bolaños, Bhalaji Nagarajan, Petia Radeva:
LOFI: LOng-tailed FIne-Grained Network for Food Recognition. 3750-3760 - Aaryam Sharma, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong:
How Much You Ate? Food Portion Estimation on Spoons. 3761-3770 - Romeo Lanzino, Federico Fontana, Anxhelo Diko, Marco Raoul Marini, Luigi Cinque:
Faster Than Lies: Real-time Deepfake Detection using Binary Neural Networks. 3771-3780 - Aashish Chandra K, Aashutosh A V, Srijan Das, Abhijit Das:
Latent Flow Diffusion for Deepfake Video Generation. 3781-3790 - Akshay Agarwal, Nalini K. Ratha:
Deepfake Catcher: Can a Simple Fusion be Effective and Outperform Complex DNNs? 3791-3801 - Raphael Antonius Frick, Martin Steinebach:
DiffSeg: Towards Detecting Diffusion-Based Inpainting Attacks Using Multi-Feature Segmentation. 3802-3808 - Alvaro Lopez Pellcier, Yi Li, Plamen Angelov:
PUDD: Towards Robust Multi-modal Prototype-based Deepfake Detection. 3809-3817 - Roberto Leyva, Victor Sanchez, Gregory Epiphaniou, Carsten Maple:
Demographic Bias Effects on Face Image Synthesis. 3818-3826 - Andrea Panzino, Simone Maurizio La Cava, Giulia Orrù, Gian Luca Marcialis:
Evaluating the Integration of Morph Attack Detection in Automated Face Recognition Systems. 3827-3836 - Andrea Ciamarra, Roberto Caldelli, Alberto Del Bimbo:
Temporal surface frame anomalies for deepfake video detection. 3837-3844 - Sara Concas, Simone Maurizio La Cava, Roberto Casula, Giulia Orrù, Giovanni Puglisi, Gian Luca Marcialis:
Quality-based Artifact Modeling for Facial Deepfake Detection in Videos. 3845-3854 - Yanhao Li, Quentin Bammey, Marina Gardella, Tina Nikoukhah, Jean-Michel Morel, Miguel Colom, Rafael Grompone von Gioi:
MaskSim: Detection of synthetic images by masked spectrum similarity analysis. 3855-3865 - Blaz Rolih, Dick Ameln, Ashwin Vaidya, Samet Akcay:
Divide and Conquer: High-Resolution Industrial Anomaly Detection via Memory Efficient Tiled Ensemble. 3866-3875 - Christian Benz, Volker Rodehorst:
Omni-Crack30k: A Benchmark for Crack Segmentation and the Reasonable Effectiveness of Transfer Learning. 3876-3886 - Ayush K. Rai, Tarun Krishna, Feiyan Hu, Alexandru Drimbarean, Kevin McGuinness, Alan F. Smeaton, Noel E. O'Connor:
Video Anomaly Detection via Spatio-Temporal Pseudo-Anomaly Generation : A Unified Approach. 3887-3899 - Andrei-Timotei Ardelean, Tim Weyrich:
Blind Localization and Clustering of Anomalies in Textures. 3900-3909 - Alex Costanzino, Pierluigi Zama Ramirez, Mirko Del Moro, Agostino Aiezzo, Giuseppe Lisanti, Samuele Salti, Luigi Di Stefano:
Test Time Training for Industrial Anomaly Segmentation. 3910-3920 - Ho-Weng Lee, Shang-Hong Lai:
TAB: Text-Align Anomaly Backbone Model for Industrial Inspection Tasks. 3921-3929 - Hansen Wijanarko, Evelyne Calista, Li-Fen Chen, Yong-Sheng Chen:
Tri-VAE: Triplet Variational Autoencoder for Unsupervised Anomaly Detection in Brain Tumor MRI. 3930-3939 - Justin Tebbe, Jawad Tayyub:
Dynamic Addition of Noise in a Diffusion Model for Anomaly Detection. 3940-3949 - Mathis Kruse, Marco Rudolph, Dominik Woiwode, Bodo Rosenhahn:
SplatPose & Detect: Pose-Agnostic 3D Anomaly Detection. 3950-3960 - Demetris Lappas, Vasileios Argyriou, Dimitrios Makris:
Dynamic Distinction Learning: Adaptive Pseudo Anomalies for Video Anomaly Detection. 3961-3970 - Laurens E. Hogeweg, Rajesh Gangireddy, Django Brunink, Vincent J. Kalkman, Ludo Cornelissen, Jacob W. Kamminga:
COOD: Combined out-of-distribution detection using multiple measures for anomaly & novel class detection in large-scale hierarchical classification. 3971-3980 - Aitor Artola, Yannis Kolodziej, Jean-Michel Morel, Thibaud Ehret:
Model-guided contrastive fine-tuning for industrial anomaly detection. 3981-3991 - Ashish Singh, Michael J. Jones, Erik G. Learned-Miller:
Tracklet-based Explainable Video Anomaly Localization. 3992-4001 - Zhengye Yang, Richard J. Radke:
Context-aware Video Anomaly Detection in Long-Term Datasets. 4002-4011 - Fahimeh Fooladgar, Minh Nguyen Nhat To, Parvin Mousavi, Purang Abolmaesumi:
Manifold DivideMix: A Semi-Supervised Contrastive Learning Framework for Severe Label Noise. 4012-4021 - Ying Zhao:
LogicAL: Towards logical anomaly synthesis for unsupervised anomaly localization. 4022-4031 - Dasol Choi, Dongbin Na:
DMR: Disentangling Marginal Representations for Out-of-Distribution Detection. 4032-4041 - Jinan Bao, Hanshi Sun, Hanqiu Deng, Yinsheng He, Zhaoxiang Zhang, Xingyu Li:
BMAD: Benchmarks for Medical Anomaly Detection. 4042-4053 - Siddeshwar Raghavan, Jiangpeng He, Fengqing Zhu:
DELTA: Decoupling Long-Tailed Online Continual Learning. 4054-4064 - Nikola Bugarin, Jovana Bugaric, Manuel Barusco, Davide Dalle Pezze, Gian Antonio Susto:
Unveiling the Anomalies in an Ever-Changing World: A Benchmark for Pixel-Level Anomaly Detection in Continual Learning. 4065-4074 - Dipam Goswami, Bartlomiej Twardowski, Joost van de Weijer:
Calibrating Higher-Order Statistics for Few-Shot Class-Incremental Learning with Pre-trained Vision Transformers. 4075-4084 - Vivek Chavan, Paul Koch, Marian Schlüter, Clemens Briese, Jörg Krüger:
Active Data Collection and Management for Real-World Continual Learning via Pretrained Oracle. 4085-4096 - Lukasz Korycki, Bartosz Krawczyk:
Class-Incremental Mixture of Gaussians for Deep Continual Learning. 4097-4106 - Eden Belouadah, Arnaud Dapogny, Kevin Bailly:
MultIOD: Rehearsal-free Multihead Incremental Object Detector. 4107-4117 - Vaibhav Singh, Anna Choromanska, Shuang Li, Yilun Du:
Wake-Sleep Energy Based Models for Continual Learning. 4118-4127 - Nourhan Bayasi, Ghassan Hamarneh, Rafeef Garbi:
Continual-Zoo: Leveraging Zoo Models for Continual Classification of Medical Images. 4128-4138 - Haoran Zhu, Maryam Majzoubi, Arihant Jain, Anna Choromanska:
TAME: Task Agnostic Continual Learning using Multiple Experts. 4139-4148 - Vuong D. Nguyen, Samiha Mirza, Abdollah Zakeri, Ayush Gupta, Khadija Khaldi, Rahma Aloui, Pranav Mantini, Shishir K. Shah, Fatima A. Merchant:
Tackling Domain Shifts in Person Re-Identification: A Survey and Analysis. 4149-4159 - Lanpei Li, Elia Piccoli, Andrea Cossu, Davide Bacciu, Vincenzo Lomonaco:
Calibration of Continual Learning Models. 4160-4169 - Junsu Kim, Yunhoe Ku, Jihyeon Kim, Junuk Cha, Seungryul Baek:
VLM-PL: Advanced Pseudo Labeling approach for Class Incremental Object Detection via Vision-Language Model. 4170-4181 - Sandesh Kamath, Albin Soutif-Cormerais, Joost van de Weijer, Bogdan Raducanu:
The Expanding Scope of the Stability Gap: Unveiling its Presence in Joint Incremental Learning of Homogeneous Tasks. 4182-4186 - Jedrzej Kozal, Jan Wasilewski, Bartosz Krawczyk, Michal Wozniak:
Continual Learning with Weight Interpolation. 4187-4195 - Alexander Krawczyk, Alexander Gepperth:
An analysis of best-practice strategies for replay and rehearsal in continual learning. 4196-4204 - Xin Gao, Xin Yang, Hao Yu, Yan Kang, Tianrui Li:
FedProK: Trustworthy Federated Class-Incremental Learning via Prototypical Feature Knowledge Transfer. 4205-4214 - Mattia Dutto, Gabriele Moreno Berton, Debora Caldarola, Eros Fanì, Gabriele Trivigno, Carlo Masone:
Collaborative Visual Place Recognition through Federated Learning. 4215-4225 - Nawrin Tabassum, Ka-Ho Chow, Xuyu Wang, Wenbin Zhang, Yanzhao Wu:
On the Efficiency of Privacy Attacks in Federated Learning. 4226-4235 - Krishna Kanth Nakka, Ahmed Frikha, Ricardo Mendes, Xue Jiang, Xuebing Zhou:
Federated Hyperparameter Optimization through Reward-Based Strategies: Challenges and Insights. 4236-4244 - Johan Edstedt, Georg Bökman, Zhenjun Zhao:
DeDoDe v2: Analyzing and Improving the DeDoDe Keypoint Detector. 4245-4253 - Hongkai Chen, Zixin Luo, Yurun Tian, Xuyang Bai, Ziyu Wang, Lei Zhou, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan:
Affine-based Deformable Attention and Selective Fusion for Semi-dense Matching. 4254-4263 - Gabriele Moreno Berton, Gabriele Goletto, Gabriele Trivigno, Alex Stoken, Barbara Caputo, Carlo Masone:
EarthMatch: Iterative Coregistration for Fine-grained Localization of Astronaut Photography. 4264-4274 - Önder Tuzcuoglu, Aybora Köksal, Bugra Sofu, Sinan Kalkan, A. Aydin Alatan:
XoFTR: Cross-modal Feature Matching Transformer. 4275-4286 - Amulya Pendota, Sumohana S. Channappayya:
Are Deep Learning Models Pre-trained on RGB Data Good Enough for RGB-Thermal Image Retrieval? 4287-4296 - Gonzalo J. Aniano Porcile, Jack Gindi, Shivansh Mundra, James R. Verbus, Hany Farid:
Finding AI-Generated Faces in the Wild. 4297-4305 - Justin Norman, Hany Farid:
An Investigation into the Impact of AI-Powered Image Enhancement on Forensic Facial Recognition. 4306-4314 - Matyas Bohacek, Hany Farid:
Lost in Translation: Lip-Sync Deepfake Detection from Audio-Video Mismatch. 4315-4323 - Shan Jia, Reilin Lyu, Kangran Zhao, Yize Chen, Zhiyuan Yan, Yan Ju, Chuanbo Hu, Xin Li, Baoyuan Wu, Siwei Lyu:
Can ChatGPT Detect DeepFakes? A Study of Using Multimodal Large Language Models for Media Forensics. 4324-4333 - Aref Azizpour, Tai D. Nguyen, Manil Shrestha, Kaidi Xu, Edward Kim, Matthew C. Stamm:
E3: Ensemble of Expert Embedders for Adapting Synthetic Image Detectors to New Generators Using Limited Data. 4334-4344 - Dimitrios Karageorgiou, Giorgos Kordopatis-Zilos, Symeon Papadopoulos:
Fusion Transformer with Object Mask Guidance for Image Forgery Analysis. 4345-4355 - Davide Cozzolino, Giovanni Poggi, Riccardo Corvi, Matthias Nießner, Luisa Verdoliva:
Raising the Bar of AI-generated Image Detection with CLIP. 4356-4366 - Farhad Shadmand, Iurii Medvedev, Luiz Schirmer, João Marcos, Nuno Gonçalves:
StampOne: Addressing Frequency Balance in Printer-proof Steganography. 4367-4376 - Jun Myeong Choi, Johnathan Leung, Noah Frahm, Max Christman, Gedas Bertasius, Roni Sengupta:
Building Secure and Engaging Video Communication by Using Monitor Illumination. 4377-4386 - Milica Gerhardt, Luca Cuccovillo, Patrick Aichroth:
Audio Provenance Analysis in Heterogeneous Media Sets. 4387-4396 - Danial Samadi Vahdati, Tai D. Nguyen, Aref Azizpour, Matthew C. Stamm:
Beyond Deepfake Images: Detecting AI-Generated Videos. 4397-4408 - Luca Cuccovillo, Milica Gerhardt, Patrick Aichroth:
Audio Transformer for Synthetic Speech Detection via Multi-Formant Analysis. 4409-4417 - Amit Kumar Singh Yadav, Kratika Bhagtani, Davide Salvi, Paolo Bestagini, Edward J. Delp:
FairSSD: Understanding Bias in Synthetic Speech Detectors. 4418-4428 - Razaib Tariq, Minji Heo, Simon S. Woo, Shahroz Tariq:
Beyond the Screen: Evaluating Deepfake Detectors under Moiré Pattern Effects. 4429-4439 - Alexandra Dana, Nadav Carmel, Amit Shomer, Ofer Manela, Tomer Peleg:
Do More With What You Have: Transferring Depth-Scale from Labeled to Unlabeled Domains. 4440-4450 - Loveneet Saini, Yu Su, Hasan Tercan, Tobias Meisen:
CenterPoint Transformer for BEV Object Detection with Automotive Radar. 4451-4460 - Carl Lindström, Georg Hess, Adam Lilja, Maryam Fatemi, Lars Hammarstrand, Christoffer Petersson, Lennart Svensson:
Are NeRFs ready for autonomous driving? Towards closing the real-to-simulation gap. 4461-4471 - Benoît Gérin, Anaïs Halin, Anthony Cioppa, Maxim Henry, Bernard Ghanem, Benoît Macq, Christophe De Vleeschouwer, Marc Van Droogenbroeck:
Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments. 4472-4482 - Kuan-Lin Wang, Li-Wu Tsao, Jhih-Ciang Wu, Hong-Han Shuai, Wen-Huang Cheng:
TrajFine: Predicted Trajectory Refinement for Pedestrian Trajectory Forecasting. 4483-4492 - Sophia Sirko-Galouchenko, Alexandre Boulch, Spyros Gidaris, Andrei Bursuc, Antonín Vobecký, Patrick Pérez, Renaud Marlet:
OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks. 4493-4503 - Kota Shimomura, Tsubasa Hirakawa, Takayoshi Yamashita, Hironobu Fujiyoshi:
Potential Risk Localization via Weak Labeling out of Blind Spot. 4504-4513 - Nitin Kumar Saravana Kannan, Matthias Reuse, Martin Simon:
Click, Crop & Detect: One-Click Offline Annotation for Human-in-the-Loop 3D Object Detection on Point Clouds. 4514-4525 - James Gunn, Zygmunt Lenyk, Anuj Sharma, Andrea Donati, Alexandru Buburuzan, John Redford, Romain Mueller:
Lift-Attend-Splat: Bird's-eye-view camera-lidar fusion using transformers. 4526-4536 - Liang Shi, Yixin Chen, Meimei Liu, Feng Guo:
DuST: Dual Swin Transformer for Multi-modal Video and Time-Series Modeling. 4537-4546 - Rong Li, Shijie Li, Xieyuanli Chen, Teli Ma, Juergen Gall, Junwei Liang:
TFNet: Exploiting Temporal Cues for Fast and Accurate LiDAR Semantic Segmentation. 4547-4556 - Eunjin Son, Sang Jun Lee:
CaBins: CLIP-based Adaptive Bins for Monocular Depth Estimation. 4557-4567 - Samuel M. Bateman, Ning Xu, H. Charles Zhao, Yael Ben Shalom, Vince Gong, Greg Long, Will Maddern:
Exploring Real World Map Change Generalization of Prior-Informed HD Map Prediction Models. 4568-4578 - Mathieu Cocheteux, Julien Moreau, Franck Davoine:
MULi-Ev: Maintaining Unperturbed LiDAR-Event Calibration. 4579-4586 - Dimitrios Kollias, Panagiotis Tzirakis, Alan Cowen, Stefanos Zafeiriou, Irene Kotsia, Alice Baird, Chris Gagne, Chunchang Shao, Guanyu Hu:
The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition. 4587-4598 - Peter Hardy, Hansung Kim:
Unsupervised Multi-Person 3D Human Pose Estimation From 2D Poses Alone. 4599-4603 - Marah Halawa, Florian Blume, Pia Bideau, Martin Maier, Rasha Abdel Rahman, Olaf Hellwich:
Multi-Task Multi-Modal Self-Supervised Learning for Facial Expression Recognition. 4604-4614 - SangHwa Hong:
Purposeful Regularization with Reinforcement Learning for Facial Expression Recognition In-the-Wild. 4615-4624 - Paul Waligora, Muhammad Haseeb Aslam, Muhammad Osama Zeeshan, Soufiane Belharbi, Alessandro Lameiras Koerich, Marco Pedersoli, Simon Bacon, Eric Granger:
Joint Multimodal Transformer for Emotion Recognition in the Wild. 4625-4635 - Chi-Hsuan Wu, Shih-Yang Liu, Xijie Huang, Xingbo Wang, Rong Zhang, Luca Minciullo, Wong Kai Yiu, Kenny Kwan, Kwang-Ting Cheng:
CMOSE: Comprehensive Multi-Modality Online Student Engagement Dataset with High-Quality Labels. 4636-4645 - Filipa Lino, Carlos Santiago, Manuel Marques:
3D Human Pose Estimation with Occlusions: Introducing BlendMimic3D Dataset and GCN Refinement. 4646-4656 - Tobias Hallmen, Fabian Deuser, Norbert Oswald, Elisabeth André:
Unimodal Multi-Task Fusion for Emotional Mimicry Intensity Prediction. 4657-4665 - Weiwei Zhou, Jiada Lu, Chenkun Ling, Weifeng Wang, Shaowei Liu:
Enhancing Emotion Recognition with Pre-trained Masked Autoencoders and Sequential Learning. 4666-4672 - Kateryna Chumachenko, Alexandros Iosifidis, Moncef Gabbouj:
MMA-DFER: MultiModal Adaptation of unimodal models for Dynamic Facial Expression Recognition in-the-wild. 4673-4682 - Niklas Wagner, Felix Mätzler, Samed Rouven Vossberg, Helen Schneider, Svetlana Pavlitska, J. Marius Zöllner:
CAGE: Circumplex Affect Guided Expression Inference. 4683-4692 - Valeriya Strizhkova, Laura M. Ferrari, Hadi Kachmar, Antitza Dantcheva, François Brémond:
Video Representation Learning for Conversational Facial Expression Recognition Guided by Multiple View Reconstruction. 4693-4702 - Andrey V. Savchenko:
Leveraging Pre-trained Multi-task Deep Models for Trustworthy Facial Analysis in Affective Behaviour Analysis in-the-Wild. 4703-4712 - Mustaqeem Khan, Jamil Ahmad, Abdulmotaleb El-Saddik, Wail Gueaieb, Giulia De Masi, Fakhri Karray:
Drone-HAT: Hybrid Attention Transformer for Complex Action Recognition in Drone Surveillance Videos. 4713-4722 - Alexander Vedernikov, Puneet Kumar, Haoyu Chen, Tapio Seppänen, Xiaobai Li:
TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals. 4723-4732 - Feng Qiu, Heming Du, Wei Zhang, Chen Liu, Lincheng Li, Tianchen Guo, Xin Yu:
Learning Transferable Compound Expressions from Masked AutoEncoder Pretraining. 4733-4741 - Feng Qiu, Wei Zhang, Chen Liu, Lincheng Li, Heming Du, Tianchen Guo, Xin Yu:
Language-guided Multi-modal Emotional Mimicry Intensity Estimation. 4742-4751 - Elena Ryumina, Maxim Markitantov, Dmitry Ryumin, Heysem Kaya, Alexey Karpov:
Zero-Shot Audio-Visual Compound Expression Recognition Method based on Emotion Probability Fusion. 4752-4760 - Wei Zhang, Feng Qiu, Chen Liu, Lincheng Li, Heming Du, Tianchen Guo, Xin Yu:
An Effective Ensemble Learning Framework for Affective Behaviour Analysis. 4761-4772 - Denis Dresvyanskiy, Maxim Markitantov, Jiawei Yu, Heysem Kaya, Alexey Karpov:
Multi-modal Arousal and Valence Estimation under Noisy Conditions. 4773-4783 - Xuan-Bach Nguyen, Hoang-Thien Nguyen, Thanh-Huy Nguyen, Nhu-Tai Do, Quang Vinh Dinh:
Emotic Masked Autoencoder on Dual-views with Attention Fusion for Facial Expression Recognition. 4784-4792 - Qiang Zhang, Tong Xiao, Haroun Habeeb, Larissa Laich, Sofien Bouaziz, Patrick Snape, Wenjing Zhang, Matthew Cioffi, Peizhao Zhang, Pavel Pidlypenskyi, Winnie Lin, Luming Ma, Mengjiao Wang, Kunpeng Li, Chengjiang Long, Steven Song, Martin Prazák, Alexander Sjoholm, Ajinkya Deogade, Jaebong Lee, Julio Delgado Mangas, Amaury Aubel:
REFA: Real-time Egocentric Facial Animations for Virtual Reality. 4793-4802 - R. Gnana Praveen, Jahangir Alam:
Recursive Joint Cross-Modal Attention for Multimodal Fusion in Dimensional Emotion Recognition. 4803-4813 - Jun Yu, Zerui Zhang, Zhihong Wei, Gongpeng Zhao, Zhongpeng Cai, Yongqi Wang, Guochen Xie, Jichao Zhu, Wangyuan Zhu, Qingsong Liu, Jiaen Liang:
AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts. 4814-4821 - Ankith Jain Rakesh Kumar, Bir Bhanu:
Uncovering Hidden Emotions with Adaptive Multi-Attention Graph Networks. 4822-4831 - Shanle Yao, Ghazal Alinezhad Noghre, Armin Danesh Pazho, Hamed Tabkhi:
Evaluating the Effectiveness of Video Anomaly Detection in the Wild Online Learning and Inference for Real-world Deployment. 4832-4841 - Hruturaj Dhake, Akshay Agarwal:
Unravelling Robustness of Deep Face Recognition Networks Against Illicit Drug Abuse Images. 4842-4848 - Andrey V. Savchenko, Anna P. Sidorova:
EmotiEffNet and Temporal Convolutional Networks in Video-based Facial Expression Recognition and Action Unit Detection. 4849-4859 - Seongjae Min, Junseok Yang, Sejoon Lim:
Emotion Recognition Using Transformers with Random Masking. 4860-4865 - Jun Yu, Wangyuan Zhu, Jichao Zhu, Zhongpeng Cai, Gongpeng Zhao, Zerui Zhang, Guochen Xie, Zhihong Wei, Qingsong Liu, Jiaen Liang:
Efficient Feature Extraction and Late Fusion Strategy for Audiovisual Emotional Mimicry Intensity Estimation. 4866-4872 - Jun Yu, Jichao Zhu, Wangyuan Zhu, Zhongpeng Cai, Gongpeng Zhao, Zhihong Wei, Guochen Xie, Zerui Zhang, Qingsong Liu, Jiaen Liang:
Multi Model Ensemble for Compound Expression Recognition. 4873-4879 - Jun Yu, Zhihong Wei, Zhongpeng Cai, Gongpeng Zhao, Zerui Zhang, Yongqi Wang, Guochen Xie, Jichao Zhu, Wangyuan Zhu, Qingsong Liu, Jiaen Liang:
Exploring Facial Expression Recognition through Semi-Supervised Pre-training and Temporal Modeling. 4880-4887 - Damith Chamalke Senadeera, Xiaoyun Yang, Dimitrios Kollias, Gregory G. Slabaugh:
CUE-Net: Violence Detection Video Analytics with Spatial Cropping, Enhanced UniformerV2 and Modified Efficient Additive Attention. 4888-4897 - Yihao Zhao, Cuiyun Yuan, Ying Liang, Yang Li, Chunxia Li, Man Zhao, Jun Hu, Ningze Zhong, Chenbin Liu:
One class classification-based quality assurance of organs-at-risk delineation in radiotherapy. 4898-4906 - Dimitrios Kollias, Anastasios Arsenos, Stefanos Kollias:
Domain adaptation, Explainability & Fairness in AI for Medical Image Analysis: Diagnosis of COVID-19 based on 3-D Chest CT-scans. 4907-4914 - Soroosh Safari Loaliyan, Greg Ver Steeg:
Comparative Analysis of Generalization and Harmonization Methods for 3D Brain fMRI Images: A Case Study on OpenBHB Dataset. 4915-4923 - Chih-Chung Hsu, Chia-Ming Lee, Yang Fan Chiang, Yi-Shiuan Chou, Chih-Yu Jiang, Shen-Chieh Tai, Chi-Han Tsai:
A Closer Look at Spatial-Slice Features Learning for COVID-19 Detection. 4924-4934 - Stuti Pandey, Josh Myers-Dean, Jarek Reynolds, Danna Gurari:
Interpreting COVID Lateral Flow Tests' Results with Foundation Models. 4935-4942 - Vuong D. Nguyen:
Fetal ECG Extraction on Time-Frequency Domain using Conditional GAN. 4943-4949 - Wenjin Zhang, Keyi Li, Sen Yang, Sifan Yuan, Ivan Marsic, Genevieve J. Sippel, Mary S. Kim, Randall S. Burd:
Focusing on What Matters: Fine-grained Medical Activity Recognition for Trauma Resuscitation via Actor Tracking. 4950-4958 - Cecilia Diana-Albelda, Roberto Alcover-Couso, Álvaro García-Martín, Jesús Bescós:
How SAM Perceives Different mp-MRI Brain Tumor Domains? 4959-4970 - Tiancheng Gu, Kaicheng Yang, Dongnan Liu, Weidong Cai:
LaPA: Latent Prompt Assist Model for Medical Visual Question Answering. 4971-4980 - Shehan Perera, Pouyan Navard, Alper Yilmaz:
SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation. 4981-4988 - Md Mostafijur Rahman, Mustafa Munir, Debesh Jha, Ulas Bagci, Radu Marculescu:
PP-SAM: Perturbed Prompts for Robust Adaption of Segment Anything Model for Polyp Segmentation. 4989-4995 - Miguel Cardoso, Carlos Santiago, Jacinto C. Nascimento:
Using Counterfactual Information for Breast Classification Diagnosis. 4996-5002 - Julia Yang, Alina Jade Barnett, Jon Donnelly, Satvik Kishore, Jerry Fang, Fides Regina Schwartz, Chaofan Chen, Joseph Y. Lo, Cynthia Rudin:
FPN-IAIA-BL: A Multi-Scale Interpretable Deep Learning Model for Classification of Mass Margins in Digital Mammography. 5003-5009 - Xingguang Zhang, Chih-Hsien Chou:
Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions. 5010-5019 - Nikoo Dehghani, Ayla Thijssen, Quirine E. W. van der Zander, Ramon-Michel Schreuder, Erik J. Schoon, Fons van der Sommen, Peter H. N. de With:
Evaluating Confidence Calibration in Endoscopic Diagnosis Models. 5020-5025 - David Anglada-Rotger, Julia Sala, Ferran Marqués, Philippe Salembier, Montse Pardàs:
Enhancing Ki-67 Cell Segmentation with Dual U-Net Models: A Step Towards Uncertainty-Informed Active Learning. 5026-5035 - Nikolaos Spanos, Anastasios Arsenos, Paraskevi-Antonia Theofilou, Paraskevi K. Tzouveli, Athanasios Voulodimos, Stefanos D. Kollias:
Complex Style Image Transformations for Domain Generalization in Medical Images. 5036-5045 - Haoyu Dong, Nicholas Konz, Hanxue Gu, Maciej A. Mazurowski:
Medical Image Segmentation with InTEnt: Integrated Entropy Weighting for Single Image Test-Time Adaptation. 5046-5055 - Mohana Singh, Vivek B. S., Jayavardhana Gubbi, Arpan Pal:
Prototype-based Interpretable Model for Glaucoma Detection. 5056-5065 - Oscar Pina, Verónica Vilaplana:
Unsupervised Domain Adaptation for Multi-Stain Cell Detection in Breast Cancer with Transformers. 5066-5074 - Md Abdur Rahaman, Zening Fu, Armin Iraji, Vince D. Calhoun:
A Deep Biclustering Framework for Brain Network Analysis. 5075-5085 - Zhixin Lai, Jing Wu, Suiyao Chen, Yucheng Zhou, Naira Hovakimyan:
Residual-based Language Models are Free Boosters for Biomedical Imaging Tasks. 5086-5096 - Hong Nguyen, Hoang Nguyen, Melinda Chang, Hieu Pham, Shrikanth Narayanan, Michael Pazzani:
ConPro: Learning Severity Representation for Medical Images using Contrastive Learning and Preference Optimization. 5105-5112 - Vazgen Zohranyan, Vagner Navasardyan, Hayk Navasardyan, Jan Borggrefe, Shant Navasardyan:
Dr-SAM: An End-to-End Framework for Vascular Segmentation, Diameter Estimation, and Anomaly Detection on Angiography Images. 5113-5121 - Ruby Wood, Enric Domingo, Viktor Hendrik Koelzer, Timothy S. Maughan, Jens Rittscher:
Cluster Triplet Loss for Unsupervised Domain Adaptation on Histology Images. 5122-5131 - Carlos Hernández-Pérez, Lauren Jimenez-Martin, Verónica Vilaplana:
Bridging Domains in Melanoma Diagnostics: Predicting BRAF Mutations and Sentinel Lymph Node Positivity with Attention-Based Models in Histological Images. 5132-5140 - Runtian Yuan, Qingqiu Li, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen:
Domain Adaptation Using Pseudo Labels for COVID-19 Detection. 5141-5148 - Qingqiu Li, Runtian Yuan, Junlin Hou, Jilan Xu, Yuejie Zhang, Rui Feng, Hao Chen:
Advancing COVID-19 Detection in 3D CT Scans. 5149-5156 - Janet Wang, Yunbei Zhang, Zhengming Ding, Jihun Hamm:
Achieving Reliable and Fair Skin Lesion Diagnosis via Unsupervised Domain Adaptation. 5157-5166 - Jack Ellis, Kofi Appiah, Emmanuel Amankwaa-Frempong, Sze Chai Kwok:
Classification of 2D Ultrasound Breast Cancer Images with Deep Learning. 5167-5173 - Kishore Kumar M, Sriprabha Ramanarayanan, Sadhana S, Arunima Sarkar, Matcha Naga Gayathri, Keerthi Ram, Mohanasankar Sivaprakasam:
DCE-diff: Diffusion Model for Synthesis of Early and Late Dynamic Contrast-Enhanced MR Images from Non-Contrast Multimodal Inputs. 5174-5183 - Sidra Aleem, Fangyijie Wang, Mayug Maniparambil, Eric Arazo, Julia Dietlmeier, Kathleen M. Curran, Noel E. O'Connor, Suzanne Little:
Test-Time Adaptation with SaLIP: A Cascade of SAM and CLIP for Zero-shot Medical Image Segmentation. 5184-5193 - Weiyi Zhang, Danli Shi, Mingguang He:
Improving Consistency in Cardiovascular Disease Risk Assessment: Cross-Camera Adaptation for Retinal Images. 5194-5199 - Ramy Farag, Parth Upadhay, Jacket Demby's, Yixiang Gao, Katherin Garces Montoya, Seyed Mohamad Ali Tousi, Gbenga Omotara, Guilherme N. DeSouza:
EfficientNet-SAM: A Novel EffecientNet with Spatial Attention Mechanism for COVID-19 Detection in Pulmonary CT Scans. 5200-5206 - Durga Supriya HL, Swetha Mary Thomas, Sowmya Kamath S:
A Multimodal Approach Integrating Convolutional and Recurrent Neural Networks for Alzheimer's Disease Temporal Progression Prediction. 5207-5215 - Robert Turnbull, Simon J. Mutch:
Separating lungs in CT scans for improved COVID19 detection. 5216-5222 - Thanh-Huy Nguyen, Thi Kim Ngan Ngo, Mai Anh Vu, Ting-Yuan Tu:
Blurry-Consistency Segmentation Framework with Selective Stacking on Differential Interference Contrast 3D Breast Cancer Spheroid. 5223-5230 - D. J. Araújo, Maria Rita Verdelho, Alceu Bissoto, J. C. Nascimento, Carlos Santiago, Catarina Barata:
Key Patches Are All You Need: A Multiple Instance Learning Framework For Robust Medical Diagnosis. 5231-5240 - Adrit Rao, Andrea Fisher, Ken Chang, John Christopher Panagides, Katherine McNamara, Joon-Young Lee, Oliver O. Aalami:
IMIL: Interactive Medical Image Learning Framework. 5241-5250 - Zong-Wei Hong, Yen-Yang Hung, Chu-Song Chen:
RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images. 5251-5260 - Ramesh Ashok Tabib, Dikshit Hegde, Uma Mudenagudi:
LGAfford-Net: A Local Geometry Aware Affordance Detection Network for 3D Point Clouds. 5261-5270 - Anushrut Jignasu, Aditya Balu, Soumik Sarkar, Chinmay Hegde, Baskar Ganapathysubramanian, Adarsh Krishnamurthy:
SDFConnect: Neural Implicit Surface Reconstruction of a Sparse Point Cloud with Topological Constraints. 5271-5279 - Jaemin Cho, Linjie Li, Zhengyuan Yang, Zhe Gan, Lijuan Wang, Mohit Bansal:
Diagnostic Benchmark and Iterative Inpainting for Layout-Guided Image Generation. 5280-5289 - Baiqi Li, Zhiqiu Lin, Deepak Pathak, Jiayao Li, Yixin Fei, Kewen Wu, Xide Xia, Pengchuan Zhang, Graham Neubig, Deva Ramanan:
Evaluating and Improving Compositional Text-to-Visual Generation. 5290-5301 - Pengliang Ji, Junchen Liu:
TlTScore: Towards Long-Tail Effects in Text-to-Visual Evaluation with Generative Foundation Models. 5302-5313 - Aayush Atul Verma, Amir Saeidi, Shamanthak Hegde, Ajay Therala, Fenil Denish Bardoliya, Nagaraju Machavarapu, Shri Ajay Kumar Ravindhiran, Srija Malyala, Agneet Chatterjee, Yezhou Yang, Chitta Baral:
Evaluating Multimodal Large Language Models across Distribution Shifts and Augmentations. 5314-5324 - Pengliang Ji, Chuyang Xiao, Huilin Tai, Mingxiao Huo:
T2VBench: Benchmarking Temporal Dynamics for Text-to-Video Generation. 5325-5335 - Muhammad Hamza Asad, Saeed Anwar, Abdul Bais:
Improved Crop and Weed Detection with Diverse Data Ensemble Learning. 5336-5345 - Jing Wu, Zhixin Lai, Suiyao Chen, Ran Tao, Pan Zhao, Naira Hovakimyan:
The New Agronomists: Language Models are Experts in Crop Management. 5346-5356 - Muhammad Zawish, Paul Albert, Flavio Esposito, Steven Davy, Lizy Abraham:
Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge. 5357-5365 - Jonathan Xu, Amna Elmustafa, Liya Weldegebriel, Emnet Negash, Richard Lee, Chenlin Meng, Stefano Ermon, David B. Lobell:
HarvestNet: A Dataset for Detecting Smallholder Farming Activity Using Harvest Piles and Remote Sensing. 5366-5374 - Zane K. J. Hartley, Rob J. Lind, Michael P. Pound, Andrew P. French:
Domain Targeted Synthetic Plant Style Transfer using Stable Diffusion, LoRA and ControlNet. 5375-5383 - Akshatha Mohan, Joshua Peeples:
Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis. 5384-5392 - Thorsten Cardoen, Sam Leroux, Pieter Simoens:
Label Efficient Lifelong Multi-View Broiler Detection. 5393-5402 - Rana Waqar, Zeljana Grbovic, Maryam Khan, Nina Pajevic, Dimitrije Stefanovic, Vladan Filipovic, Marko Panic, Nemanja Djuric:
End-to-End Deep Learning Models for Gap Identification in Maize Fields. 5403-5412 - Gonçalo P. Matos, Carlos Santiago, João Paulo Costeira, Ricardo L. Saldanha, Ernesto M. Morgado:
Tracking and Counting Apples in Orchards Under Intermittent Occlusions and Low Frame Rates. 5413-5421 - Mikolaj Cieslak, Umabharathi Govindarajan, Alejandro Garcia, Anuradha Chandrashekar, Torsten Hädrich, Aleksander Mendoza-Drosik, Dominik L. Michels, Sören Pirk, Chia-Chun Fu, Wojciech Palubicki:
Generating Diverse Agricultural Data for Vision-Based Farming Applications. 5422-5431 - Maohui Li, Michael Halstead, Chris McCool:
Knowledge Distillation for Efficient Instance Semantic Segmentation with Transformers. 5432-5439 - Sambal Shikhar, Anupam Sobti:
Label-free Anomaly Detection in Aerial Agricultural Images with Masked Image Modeling. 5440-5449 - Simone Angarano, Mauro Martini, Alessandro Navone, Marcello Chiaberge:
Domain Generalization for Crop Segmentation with Standardized Ensemble Knowledge Distillation. 5450-5459 - Oishee Bintey Hoque, Samarth Swarup, Abhijin Adiga, Sayjro Kossi Nouwakpo, Madhav V. Marathe:
IrrNet: Advancing Irrigation Mapping with Incremental Patch Size Training on Remote Sensing Imagery. 5460-5469 - Heesup Yun, Sassoum Lo, Christine H. Diepenbrock, Brian N. Bailey, J. Mason Earles:
VisTA-SR: Improving the Accuracy and Resolution of Low-Cost Thermal Imaging Cameras for Agriculture. 5470-5479 - Xingjian Li, Jeremy Park, Chris Reberg-Horton, Steven B. Mirsky, Edgar J. Lobaton, Lirong Xiang:
Photorealistic Arm Robot Simulation for 3D Plant Reconstruction and Automatic Annotation using Unreal Engine 5. 5480-5488 - Toqi Tahamid Sarker, Mohamed G. Embaby, Khaled R. Ahmed, Amer AbuGhazaleh:
Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging. 5489-5497 - Pawel Majewski, Piotr Lampa, Robert Burduk, Jacek Reiner:
End-to-end Solution for Tenebrio Molitor Rearing Monitoring with Uncertainty Estimation and Domain Shift Detection. 5498-5507 - Hanbo Zhang, Jie Xu, Yuchen Mo, Tao Kong:
InViG: Benchmarking Open-Ended Interactive Visual Grounding with 500K Dialogues. 5508-5518 - Haoyang Cheng, Haitao Wen, Heqian Qiu, Lanxiao Wang, Minjian Zhang, Hongliang Li:
Must Unsupervised Continual Learning Relies on Previous Information? 5519-5529 - Heqian Qiu, Lanxiao Wang, Taijin Zhao, Fanman Meng, Hongliang Li:
HumanFormer: Human-centric Prompting Multi-modal Perception Transformer for Referring Crowd Detection. 5530-5540 - Yiming Xiao, Fanman Meng, Qingbo Wu, Linfeng Xu, Mingzhou He, Hongliang Li:
GM-DETR: Generalized Muiltispectral DEtection TRansformer with Efficient Fusion Encoder for Visible-Infrared Detection. 5541-5549 - Jinmeng Wu, Pengcheng Shu, Hanyu Hong, Lei Ma, Ying Zhu, Lei Wang:
Pre-trained Bidirectional Dynamic Memory Network For Long Video Question Answering. 5550-5557 - Xuan Li, Rongfu Chen, Jie Wang, Lei Ma, Li Cheng, Haiwen Yuan:
DSTCFuse: A Method based on Dual-cycled Cross-awareness of Structure Tensor for Semantic Segmentation via Infrared and Visible Image Fusion. 5558-5567 - Yusong Cai, Shimou Ling, Liang Zhang, Lili Pan, Hongliang Li:
Is Our Continual Learner Reliable? Investigating Its Decision Attribution Stability through SHAP Value Consistency. 5568-5575 - Yezhi Shen, Weichen Xu, Qian Lin, Jan P. Allebach, Fengqing Zhu:
GRIB: Combining Global Reception and Inductive Bias For Human Segmentation and Matting. 5576-5585 - Kai Kohyama, Shintaro Shiba, Yoshimitsu Aoki:
3D Human Scan With A Moving Event Camera. 5586-5596 - Tomas Hodan, Martin Sundermeyer, Yann Labbé, Van Nguyen Nguyen, Gu Wang, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Jiri Matas:
BOP Challenge 2023 on Detection, Segmentation and Pose Estimation of Seen and Unseen Rigid Objects. 5610-5619 - Bang Du, Kunyao Chen, Haochen Zhang, Fei Yin, Baichuan Wu, Truong Nguyen:
Modeling Detailed Human Geometry with Adaptive Local Refinement. 5620-5630 - Seungha Noh, Kangmin Bae, Yuseok Bae, Byong-Dai Lee:
H3Net: Irregular Posture Detection by Understanding Human Character and Core Structures. 5631-5641 - Thanh-Dat Truong, Pierce Helton, Ahmed Moustafa, Jackson David Cothren, Khoa Luu:
CONDA: Continual Unsupervised Domain Adaptation Learning in Visual Perception for Self-Driving Cars. 5642-5650 - Armin Danesh Pazho, Ghazal Alinezhad Noghre, Vinit Katariya, Hamed Tabkhi:
VT-Former: An Exploratory Study on Vehicle Trajectory Prediction for Highway Surveillance through Graph Isomorphism and Transformer. 5651-5662 - Yujin Tang, Peijie Dong, Zhenheng Tang, Xiaowen Chu, Junwei Liang:
VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting. 5663-5673 - Boris Culjak, Nina Pajevic, Vladan Filipovic, Dimitrije Stefanovic, Zeljana Grbovic, Nemanja Djuric, Marko Panic:
Exploration of Data Augmentation Techniques for Bush Detection in Blueberry Orchards. 5674-5683 - Pietro Bonazzi, Sizhen Bian, Giovanni Lippolis, Yawei Li, Sadique Sheik, Michele Magno:
Retina : Low-Power Eye Tracking with Event Camera and Spiking Hardware. 5684-5692 - Niloufar Pourian, Alexey Supikov:
Joint Motion Detection in Neural Videos Training. 5693-5700 - Asude Aydin, Mathias Gehrig, Daniel Gehrig, Davide Scaramuzza:
A Hybrid ANN-SNN Architecture for Low-Power and Low-Latency Visual Perception. 5701-5711 - Christoph Reich, Oliver Hahn, Daniel Cremers, Stefan Roth, Biplob Debnath:
A Perspective on Deep Vision Performance with Standard Image and Video Codecs. 5712-5721 - Yueyu Hu, Onur G. Guleryuz, Philip A. Chou, Danhang Tang, Jonathan Taylor, Rus Maxham, Yao Wang:
One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing. 5722-5731 - Christoph Reich, Biplob Debnath, Deep Patel, Tim Prangemeier, Daniel Cremers, Srimat Chakradhar:
Deep Video Codec Control for Vision Models. 5732-5741 - Jia-Jie Lim, Matthias Sebastian Treder, Aaron Chadha, Yiannis Andreopoulos:
Adaptive Render-Video Streaming for Virtual Environments. 5742-5751 - Yueyu Hu, Ran Gong, Qi Sun, Yao Wang:
Low Latency Point Cloud Rendering with Learned Splatting. 5752-5761 - Zhong Wang, Zengyu Wan, Han Han, Bohao Liao, Yuliang Wu, Wei Zhai, Yang Cao, Zheng-Jun Zha:
MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking. 5762-5770 - Baoheng Zhang, Yizhao Gao, Jingyuan Li, Hayden Kwok-Hay So:
Co-designing a Sub-millisecond Latency Event-based Eye Tracking System with Submanifold Sparse CNN. 5771-5779 - Yan Ru Pei, Sasskia Brüers, Sébastien M. Crouzet, Douglas McLelland, Olivier Coenen:
A Lightweight Spatiotemporal Network for Online Eye Tracking with Event Camera. 5780-5788 - Xiaopeng Lin, Hongwei Ren, Bojun Cheng:
FAPNet: An Effective Frequency Adaptive Point-based Eye Tracker. 5789-5798 - Chenlong He, Qi Zheng, Ruoxi Zhu, Xiaoyang Zeng, Yibo Fan, Zhengzhong Tu:
COVER: A Comprehensive Video Quality Evaluator. 5799-5809 - Zuowen Wang, Chang Gao, Zongwei Wu, Marcos V. Conde, Radu Timofte, Shih-Chii Liu, Qinyu Chen, Zhengjun Zha, Wei Zhai, Han Han, Bohao Liao, Yuliang Wu, Zengyu Wan, Zhong Wang, Yang Cao, Ganchao Tan, Jinze Chen, Yan Ru Pei, Sasskia Brüers, Sébastien M. Crouzet, Douglas McLelland, Olivier Coenen, Baoheng Zhang, Yizhao Gao, Jingyuan Li, Hayden Kwok-Hay So, Philippe Bich, Chiara Boretti, Luciano Prono, Mircea Lica, David Dinucu-Jianu, Catalin Grîu, Xiaopeng Lin, Hongwei Ren, Bojun Cheng, Xinan Zhang, Valentin Vial, Anthony Yezzi, James Tsai:
Event-Based Eye Tracking. AIS 2024 Challenge Survey. 5810-5825 - Marcos V. Conde, Saman Zadtootaghaj, Nabajeet Barman, Radu Timofte, Chenlong He, Qi Zheng, Ruoxi Zhu, Zhengzhong Tu, Haiqiang Wang, Xiangguang Chen, Wenhui Meng, Xiang Pan, Huiying Shi, Han Zhu, Xiaozhong Xu, Lei Sun, Zhenzhong Chen, Shan Liu, Zicheng Zhang, Haoning Wu, Yingjie Zhou, Chunyi Li, Xiaohong Liu, Weisi Lin, Guangtao Zhai, Wei Sun, Yuqin Cao, Yanwei Jiang, Jun Jia, Zhichao Zhang, Zijian Chen, Weixia Zhang, Xiongkuo Min, Steve Göring, Zihao Qi, Chen Feng:
AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results. 5826-5837 - Marcos V. Conde, Zhijun Lei, Wen Li, Ioannis Katsavounidis, Radu Timofte, Min Yan, Xin Liu, Qian Wang, Xiaoqian Ye, Zhan Du, Tiansen Zhang, Zhiyuan Li, Hao Wei, Chenyang Ge, Jiangtao Lv, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Menghan Zhou, Yiqiang Yan, Kihwan Yoon, Ganzorig Gankhuyag, Jae-Hyeon Lee, Ui-Jin Choi, Hyeon-Cheol Moon, Tae Hyun Jeong, Yoonmo Yang, Jae-Gon Kim, Jinwoo Jeong, Sunjei Kim, Xintao Qiu, Yuanbo Zhou, Kongxian Wu, Xinwei Dai, Hui Tang, Wei Deng, Qingquan Gao, Tong Tong, Long Peng, Jiaming Guo, Xin Di, Bohao Liao, Zhibo Du, Peize Xia, Renjing Pei, Yang Wang, Yang Cao, Zhengjun Zha, Bingnan Han, Hongyuan Yu, Zhuoyuan Wu, Cheng Wan, Yuqing Liu, Haodong Yu, Jizhe Li, Zhijuan Huang, Yuan Huang, Yajun Zou, Xianyu Guan, Qi Jia, Heng Zhang, Xuanwu Yin, Kunlong Zuo, Dongyang Zhang, Tianle Liu, Huaian Chen, Yi Jin:
Real-Time 4K Super-Resolution of Compressed AVIF Images. AIS 2024 Challenge Survey. 5838-5856 - William Avery, Mustafa Munir, Radu Marculescu:
Scaling Graph Convolutions for Mobile Vision. 5857-5865 - Anshul Nasery, Hardik Shah, Arun Sai Suggala, Prateek Jain:
End-to-End Neural Network Compression via l1/l2 Regularized Latency Surrogates. 5866-5877 - Molin Zhang, Soumendu Majee, Chengyu Wang, Seok-Jun Lee, Hamid R. Sheikh:
CoDISP: Exploring Compressed Domain Camera ISP with RGB-guided Encoder. 5878-5888 - Nadhira Noor, Fabianaugie Jametoni, Jinbeom Kim, Hyunsu Hong, In Kyu Park:
Efficient Skeleton-Based Action Recognition for Real-Time Embedded Systems. 5889-5897 - Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield:
S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal. 5898-5908 - Joonsoo Kim, Zhe Zhu, Tien Bau, Chenguang Liu:
DCDR-UNet: Deformable Convolution Based Detail Restoration via U-shape Network for Single Image HDR Reconstruction. 5909-5918 - Xu Ouyang, Ying Chen, Kaiyue Zhu, Gady Agam:
Image restoration refinement with Uformer GAN. 5919-5928 - Ziyan Chen, Jingwen He, Xinqi Lin, Yu Qiao, Chao Dong:
Towards Real-world Video Face Restoration: A New Benchmark. 5929-5939 - Sanghyun Kim, Min Jung Lee, Woohyeok Kim, Deunsol Jung, Jaesung Rim, Sunghyun Cho, Minsu Cho:
Burst Image Super-Resolution with Base Frame Selection. 5940-5949