


WACV 2025: Tucson, AZ, USA - Workshops
- IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2025 - Workshops, Tucson, AZ, USA, February 28 - March 4, 2025. IEEE 2025, ISBN 979-8-3315-3662-6
- Josep López Camuñas, Cristina Bustos, Yanjun Zhu, Raquel Ros, Àgata Lapedriza:
Experimenting with Affective Computing Models in Video Interviews with Spanish-Speaking Older Adults. 1-10 - Ayda Eghbalian, Md Mushfiqur Azam, Katie Holloway, Leslie Neely, Kevin Desai:
Applying Computer Vision to Analyze Self-Injurious Behaviors in Children with Autism Spectrum Disorder. 11-20 - Jisoo Lee, Tamim Ahmed, Thanassis Rikakis, Pavan K. Turaga:
Automatic Temporal Segmentation for Post-Stroke Rehabilitation: A Keypoint Detection and Temporal Segmentation Approach for Small Datasets. 21-29 - William Heyden, Habib Ullah, Muhammad Salman Siddiqui, Fadi Al Machot:
SEER-ZSL: Semantic Encoder-Enhanced Representations for Generalized Zero-Shot Learning. 30-41 - Shayda Moezzi, Michael Wan, Sai Kumar Reddy Manne, Amal Mathew, Shaotong Zhu, Bishoy Galoaa, Elaheh Hatamimajoumerd, Emma C. Grace, Cassandra B. Rowan, Emily Zimmerman, Briana J. Taylor, Marie J. Hayes, Sarah Ostadabbas:
Classification of Infant Sleep-Wake States from Natural Overnight In-Crib Sleep Videos. 42-51 - Yanjun Zhu, Chen Bai, Cheng Lu, David S. Doermann, Àgata Lapedriza:
UniMotion: Bridging 2D and 3D Representations for Human Motion Prediction. 52-62 - Kuan-Wei Tseng, Rei Kawakami, Satoshi Ikehata, Ikuro Sato:
CST: Character State Transformer for Object-Conditioned Human Motion Prediction. 63-72 - Somaieh Amraee, Eva Blanco-Mallo, Deniz Erdogmus, Aston McCullough, Matthew S. Goodwin, Sarah Ostadabbas:
Advancing Multi-Person Tracking for Autism Behavior Analysis: Challenges, Opportunities, and Future Directions in Clinical Settings. 73-82 - Rasel Ahmed Bhuiyan, Mateusz Trokielewicz, Piotr Maciejewicz, Sherri L. Bucher, Adam Czajka:
Iris Recognition for Infants. 83-92 - Tejas Anvekar, Shivanand Venkanna Sheshappanavar:
Mahalanobis k-NN: A Statistical Lens for Robust Point-Cloud Registrations. 93-102 - Jeonghwan Lee, Heywon Yun, Jimin Kim, Homa Fashandi:
Improving Human Pose-Conditioned Generation: Fine-Tuning ControlNet Models with Reinforcement Learning. 103-112 - Jiaojiao Ye, Zhen Wang, Linnan Jiang:
PQD: Post-Training Quantization for Efficient Diffusion Models. 113-119 - Zhiqiang Lao, Yu Guo, Xiyun Song, Yubin Zhou, Zongfang Lin, Heather Yu, Liang Peng:
High-Fidelity 4× Neural Reconstruction of Real-time Path Traced Images. 120-129 - Zilong Wu, Hideki Murata, Nayu Takahashi, Qiyu Wu, Yoshimasa Tsuruoka:
LatentPS: Image Editing Using Latent Representations in Diffusion Models. 130-139 - Gabriella Pangelinan, Grace Bezold, Haiyu Wu, Michael C. King, Kevin W. Bowyer:
Lights, Camera, Matching: The Role of Image Illumination in Fair Face Recognition. 140-149 - Jens Duym, José Oramas, Ali Anwar:
Quantifying Generative Stability: Mode Collapse Entropy Score for Mode Diversity Evaluation. 150-159 - Dennis Menn, Feng Liang, Hung-Yueh Chiang, Diana Marculescu:
Similarity Trajectories: Linking Sampling Process to Artifacts in Diffusion-Generated Images. 160-169 - Ha Thu Nguyen, Seyed Ali Amirshahi, Katrien De Moor, Mohamed-Chaker Larabi:
A Distortion Aware Image Quality Assessment Model. 170-179 - Sai Tarun Inaganti, Gennady Petrenko:
MambaTron: Efficient Cross-Modal Point Cloud Enhancement Using Aggregate Selective State Space Modeling. 180-190 - Hao Wang, Xiwen Chen, Ashish Bastola, Jiayou Qin, Abolfazl Razi:
Diffusion Prism: Enhancing Diversity and Morphology Consistency in Mask-to-Image Diffusion. 191-200 - Sadia Mubashshira, Kevin Desai:
TE-NeRF: Triplane-Enhanced Neural Radiance Field for Artifact-Free Human Rendering. 201-210 - Tharun Anand, Aryan Garg, Kaushik Mitra:
IP-FaceDiff: Identity-Preserving Facial Video Editing with Diffusion. 211-221 - Varun Biyyala, Bharat Chanderprakash Kathuria, Jialu Li, Youshan Zhang:
SST-EM: Advanced Metrics for Evaluating Semantic, Spatial and Temporal Aspects in Video Editing. 222-231 - Shiman Zhang, Lakshmikar Reddy Polamreddy, Youshan Zhang:
Confident Pseudo-Labeled Diffusion Augmentation for Canine Cardiomegaly Detection. 232-241 - Mahdi Jampour:
Revealing Palimpsests with Latent Diffusion Models: A Generative Approach to Image Inpainting and Handwriting Reconstruction. 242-249 - Zhemin Zhang, Bhavika Patel, Bhavik Patel, Imon Banerjee:
Unsupervised Generative Approach for Anomaly Detection to Enhance the Quality of Unseen Medical Datasets. 250-259 - Zeyun Deng, Joseph Campbell:
Sparse Mixture-of-Experts for Non-Uniform Noise Reduction in MRI Images. 260-268 - Shaurya Singh Rathore, Aravind Shenoy, Krish Didwania, Aditya Kasliwal, Ujjwal Verma:
HipyrNet: Hypernet-Guided Feature Pyramid Network for Mixed-Exposure Correction. 269-277 - Crispian Morris, Nantheera Anantrasirichai, Fan Zhang, David Bull:
DaBiT: Depth and Blur Informed Transformer for Video Focal Deblurring. 278-288 - Avinash Amballa, Gayathri Akkinapalli, Vinitra Muralikrishnan:
LS-GAN: Human Motion Synthesis with Latent-Space GANs. 289-298 - Shrey Vishen, Jatin Sarabu, Saurav Kumar, Chinmay Bharathulwar, Rithwick Lakshmanan, Vishnu Srinivas:
Advancing Super-Resolution in Neural Radiance Fields via Variational Diffusion Strategies. 299-306 - Hamza Karim, Yasin Yilmaz:
Invisibility Cloak: Hiding Anomalies in Videos via Adversarial Machine Learning Attacks. 307-316 - Mehran Adibi Sedeh, Assia Benbihi, Romain Martin, Marianne Clausel, Cédric Pradalier:
AttriVision: Advancing Generalization in Pedestrian Attribute Recognition using CLIP. 317-328 - Ahmed S. Abdelrahman, Mohamed A. Abdel-Aty, Dongdong Wang:
Video-to-Text Pedestrian Monitoring (VTPM): Leveraging Large Language Models for Privacy-Preserve Pedestrian Activity Monitoring at Intersections. 329-338 - Daniel Rossi, Guido Borghi, Roberto Vezzani:
TakuNet: An Energy-Efficient CNN for Real-Time Inference on Embedded UAV Systems in Emergency Response Scenarios. 339-348 - Younghan Kim, Kangryun Moon, Yongjun Park, Yonggyu Kim:
Causal Representation-Based Domain Generalization on Gaze Estimation. 349-358 - Jacqueline Kockwelp, Daniel Beckmann, Benjamin Risse:
Human Gaze Improves Vision Transformers by Token Masking. 359-368 - Manuela Kunz, Kathleen C. Fraser, R. Bruce Wallace, Frank Knoefel, Rafik Goubran, Sina Shafiyan, Neil Thomas:
Addressing Age Bias in the Application of Appearance-Based Gaze-Tracking for Older Adults. 369-377 - Harshit, Tolga Tasdizen:
VISTA: A Visual and Textual Attention Dataset for Interpreting Multimodal Models. 378-385 - Felix O'Mahony, Patrick Furst, Michael Drews, Kai Dierkes:
On Segmenting Pupil Contours in Terms of Elliptical Fourier Series. 386-395 - Purvam Jain, Althaf M. Nazar, Salman Siddique Khan, Kaushik Mitra, Praneeth Chakravarthula:
FlatTrack: Eye-Tracking with Ultra-Thin Lensless Cameras. 396-404 - Juyeop Han, Guilherme Venturelli Cavalheiro, Josef Biberstein, Elham Alkabawi, Shahad Alqhatni, Fadwa Alaskar, Eman Bin Khunayn, Sertac Karaman:
CaLiSa-NeRF: Neural Radiance Field with Pinhole Camera Images, LiDAR point clouds and Satellite Imagery for Urban Scene Representation. 405-413 - Nandini Saini, Ashudeep Dubey, Debasis Das, Chiranjoy Chattopadhyay:
Advancing Open-Set Object Detection in Remote Sensing Using Multimodal Large Language Model. 414-421 - Ciem Cornelissen, Sam Leroux, Pieter Simoens:
Adaptive Clustering for Efficient Phenotype Segmentation of UAV Hyperspectral Data. 422-431 - Seyed Mohamad Ali Tousi, Ramy Farag, Jacket Demby's, Gbenga Omotara, John A. Lory, Guilherme N. DeSouza:
A Zero-Shot Learning Approach for Ephemeral Gully Detection from Remote Sensing using Vision Language Models. 432-441 - Fabian Deuser, Wejdene Mansour, Hao Li, Konrad Habel, Martin Werner, Norbert Oswald:
Temporal Resilience in Geo-Localization: Adapting to the Continuous Evolution of Urban and Rural Environments. 442-451 - Rohit Kumar, Tanishq Sharma, Vedanshi Vaghela, Sanjeev K. Jha, Akshay Agarwal:
PrecipFormer: Efficient Transformer for Precipitation Downscaling. 452-460 - Alex Berian, Daniel Brignac, JhihYang Wu, Natnael Daba, Abhijit Mahalanobis:
CrossModalityDiffusion: Multi-Modal Novel View Synthesis with Unified Intermediate Representation. 461-469 - Ori Linial, George Leifman, Yochai Blau, Nadav Sherman, Yotam Gigi, Wojciech Sirko, Genady Beryozkin:
Enhancing Remote Sensing Representations Through Mixed-Modality Masked Autoencoding. 470-479 - Amrita Gupta, Anthony Ortiz, Simone Fobi Nsutezo, Duncan Kebut, Seema Iyer, Rahul Dodhia, Juan M. Lavista Ferres:
Mapping Refugee Camps with AI: A Benchmark Dataset and Baseline Models for Humanitarian Applications. 480-488 - Heng Fang, Hossein Azizpour:
Leveraging Satellite Image Time Series for Accurate Extreme Event Detection. 489-498 - Shunsuke Takao:
MD-Glow: Multi-task Despeckling Glow for SAR Image Enhancement. 499-506 - Martina Pastorino, Gabriele Moser, Sebastiano B. Serpico, Josiane Zerubia:
Multiresolution Fusion and Classification of Hyperspectral and Panchromatic Remote Sensing Images. 507-516 - Nicolas Houdré, Diego Marcos, Dino Ienco, Laurent Wendling, Camille Kurtz, Sylvain Lobry:
ProMM-RS: Exploring Probabilistic Learning for Multi-Modal Remote Sensing Image Representations. 517-525 - Daniel Panangian, Ksenia Bittner:
Dfilled: Repurposing Edge-Enhancing Diffusion for Guided DSM Void Filling. 526-534 - Michael J. Bianco, David Eigen, Michael Gormish:
Enhancing Worldwide Image Geolocation by Ensembling Satellite-Based Ground-Level Attribute Predictors. 535-543 - Hongcheng Jiang, ZhiQiang Chen:
Hyperspectral Pansharpening with Transformer-Based Spectral Diffusion Priors. 544-553 - Mariya Jose, Stefan Auer, Jiaojiao Tian:
Direction-Guided Segmentation and Vectorisation of Curbstones from High-Resolution Ortho-Images. 554-561 - Aaron Perez, Saurabh Prasad:
Layer Optimized Spatial Spectral Masked Autoencoder for Semantic Segmentation of Hyperspectral Imagery. 562-570 - Takayuki Shinohara:
Pre-Training of Auto-Generated Synthetic 3D Point Cloud Segmentation for Outdoor Scenes. 571-580 - Justin McMillen, Yasin Yilmaz:
FuseForm: Multimodal Transformer for Semantic Segmentation. 581-590 - Harleen Hanspal, Alessandro De Palma, Alessio Lomuscio:
Robustness to Perturbations in the Frequency Domain: Neural Network Verification and Certified Training. 591-600 - Simen Cassiman, Marc Proesmans, Tinne Tuytelaars, Luc Van Gool:
Model Weights Reflect a Continuous Space of Input Image Domains. 601-610 - Lukás Picek, Vojtech Cermák, Marek Hanzl:
Zero-Shot Hazard Identification in Autonomous Driving: A Case Study on the COOOL Benchmark. 611-620 - Mahdi Abbariki, Maged Shoman:
Interpreting the Unexpected: A Multimodal Framework for Out-of-Label Hazard Detection and Explanation in Autonomous Driving. 621-628 - Akshat Ghiya, Ali K. AlShami, Jugal Kalita:
SGNetPose+: Stepwise Goal-Driven Networks with Pose Information for Trajectory Prediction in Autonomous Driving. 629-637 - Parisa Hatami, Maged Shoman, Mina Sartipi:
Open-World Hazard Detection and Captioning for Autonomous Driving with a Unified Multimodal Pipeline. 638-646 - Sotirios Stamnas, Victor Sanchez:
DiffFake: Exposing Deepfakes Using Differential Anomaly Detection. 647-657 - Konstantinos Tsigos, Evlampios Apostolidis, Vasileios Mezaris:
Improving the Perturbation-Based Explanation of Deepfake Detectors Through the Use of Adversarially-Generated Samples. 658-667 - Musab Al-Ghadi, Joris Voerman, Mickaël Coustaty, Olivier Lessard, Nicolas Sidere:
IDTrust: Deep Identity Document Quality Detection with Bandpass Filtering. 668-675 - Anant Mehta, Bryant McArthur, Nagarjuna Kolloju, Zhengzhong Tu:
HFMF: Hierarchical Fusion Meets Multi-Stream Models for Deepfake Detection. 676-685 - Hongyang Xie, Hongyang He, Boyang Fu, Victor Sanchez:
GrDT: Towards Robust Deepfake Detection Using Geometric Representation Distribution and Texture. 686-696 - Guray Ozgur, Eduarda Caldeira, Tahar Chettaoui, Fadi Boutros, Raghavendra Ramachandra, Naser Damer:
FoundPAD: Foundation Models Reloaded for Face Presentation Attack Detection. 697-707 - Philip Wootaek Shin, Jack Sampson, Vijaykrishnan Narayanan, Andres Marquez, Mahantesh Halappanavar:
Disharmony: Forensics Using Reverse Lighting Harmonization. 708-717 - Marco Huber, Pedro C. Neto, Ana Filipa Sequeira, Naser Damer:
FX-MAD: Frequency-Domain Explainability and Explainability-Driven Unsupervised Detection of Face Morphing Attacks. 718-728 - Giovanna Maria Dimitri, Benedetta Tondi, Mauro Barni:
Enhancing Synthetic Generated-Images Detection through Post-Hoc Calibration. 729-736 - Giovanni Pio Delvecchio, Huy H. Nguyen, Isao Echizen:
Zero-Shot Warning Generation for Misinformative Multimodal Content. 737-746 - Emanuele Mule, Matteo Pannacci, Ali Ghasemi Goudarzi, Francesco Pro, Lorenzo Papa, Luca Maiano, Irene Amerini:
Enhancing Ground-to-Aerial Image Matching for Visual Misinformation Detection Using Semantic Segmentation. 747-755 - Stephane F. Schwarz, Paulo Fonseca, Anderson Rocha:
Zero-Training Fraud Detection in a Large Messaging Platform? 756-764 - Safwen Naimi, Wassim Bouachir, Guillaume-Alexandre Bilodeau, Brian L. Mishara:
SSTAR: Skeleton-Based Spatio-Temporal Action Recognition for Intelligent Video Surveillance and Suicide Prevention in Metro Stations. 765-775 - Iñaki Erregue, Kamal Nasrollahi, Sergio Escalera:
YOLO11-JDE: Fast and Accurate Multi-Object Tracking with Self-Supervised Re-ID. 776-785 - Michel van Lier, Martin Van Leeuwen, Bastian Van Manen, Leo Kampmeijer, Nicolas Boehrer:
Evaluation of Spatio-Temporal Small Object Detection in Real-World Adverse Weather Conditions. 786-797 - Marc Oliu, Rohat Bozyil, Mia Sandra Nicole Siemon, Alejandro Martinez-Senent, Sergio Escalera, Thomas B. Moeslund, Kamal Nasrollahi:
Fall Detection: Leveraging Depth Information in Bayesian Networks. 798-805 - Anastasija Manojlovska, Raghavendra Ramachandra, Georgios Spathoulas, Vitomir Struc, Klemen Grm:
Interpreting Face Recognition Templates Using Natural Language Descriptions. 806-815 - Alexandru Niculescu-Mizil, Deep Patel, Iain Melvin:
MCTR: Multi Camera Tracking Transformer. 816-826 - Xiwen Li, Rehman Mohammed, Tristalee Mangin, Surojit Saha, Kerry E. Kelly, Ross T. Whitaker, Tolga Tasdizen:
Joint Audio-Visual Idling Vehicle Detection with Streamlined Input Dependencies. 827-836 - Amey Noolkar, Victor Sanchez:
Simultaneous Multi-Object Multi-Camera Trajectory Forecasting (SMO-MCTF). 837-843 - Nikolaos Kaparinos, Vasileios Mezaris:
B-FPGM: Lightweight Face Detection via Bayesian-Optimized Soft FPGM Pruning. 844-853 - Gopi Raju Matta, Reddypalli Trisha, Kaushik Mitra:
BeSplat: Gaussian Splatting from a Single Blurry Image and Event Stream. 854-864 - Himanshu Kumar, Aniket Konkar:
Simple Transformer with Single Leaky Neuron for Event Vision. 865-871 - Thilo Reinold, Suman Ghosh, Guillermo Gallego:
Combined Physics and Event Camera Simulator for Slip Detection. 872-880 - JongHun Park, MunHo Hong:
Continuous Histogram for Event-Based Vision Camera Systems. 881-889 - Ross Greer, Bjørk Antoniussen, Andreas Møgelmose, Mohan M. Trivedi:
Language-Driven Active Learning for Diverse Open-Set 3D Object Detection. 890-898 - Xuewen Luo, Fan Ding, Fengze Yang, Yang Zhou, Junnyong Loo, Hwa Hui Tew, Chenxi Liu:
SenseRAG: Constructing Environmental Knowledge Bases with Proactive Querying for LLM-Based Autonomous Driving. 899-906 - Athira Krishnan R, Sumukha BG, Ambarish Parthasarathy:
Glimpse of MCQ Based VQA in Road & Traffic Scenarios. 907-910 - Shuo Xing, Chengyuan Qian, Yuping Wang, Hongyuan Hua, Kexin Tian, Yang Zhou, Zhengzhong Tu:
OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving. 911-919 - Yunsheng Ma, Wenqian Ye, Can Cui, Haiming Zhang, Shuo Xing, Fucai Ke, Jinhong Wang, Chenglin Miao, Jintai Chen, Hamid Rezatofighi, Zhen Li, Guangtao Zheng, Chao Zheng, Tianjiao He, Manmohan Chandraker, Burhaneddin Yaman, Xin Ye, Hang Zhao, Xu Cao:
Position: Prospective of Autonomous Driving - Multimodal LLMs, World Models, Embodied Intelligence, AI Alignment, and Mamba. 920-936 - Aryan Keskar, Srinivasa Perisetla, Ross Greer:
Evaluating Multimodal Vision-Language Model Prompting Strategies for Visual Question Answering in Road Scene Understanding. 937-946 - Esteban Rivera, Jannik Lübberstedt, Nico Uhlemann, Markus Lienkamp:
Scenario Understanding of Traffic Scenes Through Large Visual Language Models. 947-955 - Cagri Gungor, Adriana Kovashka:
Enhancing Weakly-Supervised Object Detection on Static Images Through (Hallucinated) Motion. 956-960 - Amirhosein Chahe, Lifeng Zhou:
Query3D: LLM-Powered Open-Vocabulary Scene Segmentation with Language Embedded 3D Gaussians. 961-970 - Liang Shi, Boyu Jiang, Tong Zeng, Feng Guo:
ScVLM: Enhancing Vision-Language Model for Safety-Critical Event Understanding. 971-981 - Mao Ye, Gregory P. Meyer, Zaiwei Zhang, Dennis Park, Siva Karthik Mustikovela, Yuning Chai, Eric M. Wolff:
VLMine: Long-Tail Data Mining with Vision Language Models. 982-992 - Zhengye Yang, Richard J. Radke:
Detecting Contextual Anomalies by Discovering Consistent Spatial Regions. 993-1002 - Furkan Mumcu, Michael J. Jones, Yasin Yilmaz, Anoop Cherian:
ComplexVAD: Detecting Interaction Anomalies in Video. 1003-1012 - Haydar Mehryar, Chengzhi Mao, Loren Schwiebert:
ML-JET: A Benchmark Dataset for Classifying Heavy Ion Collisions. 1013-1022 - Yiming Che, Fazle Rafsani, Jay Shah, Md Mahfuzur Rahman Siddiquee, Teresa Wu:
AnoFPDM: Anomaly Detection with Forward Process of Diffusion Models for BrainMRI. 1023-1032 - Narges Rashvand, Ghazal Alinezhad Noghre, Armin Danesh Pazho, Shanle Yao, Hamed Tabkhi:
Exploring Pose-Based Anomaly Detection for Retail Security: A Real-World Shoplifting Dataset and Benchmark. 1033-1041 - Robert F. Maack, Lars Thun, Thomas Liang, Hasan Tercan, Tobias Meisen:
PCAD: A Real-World Dataset for 6D Pose Industrial Anomaly Detection. 1042-1051 - Yuto Kumamoto, Kento Ohtani, Daiki Suzuki, Minori Yamataka, Kazuya Takeda:
AAT-DA: Accident Anticipation Transformer with Driver Attention. 1052-1061 - Nico Uhlemann, Yipeng Zhou, Tobias Simeon Mohr, Markus Lienkamp:
Snapshot: Towards Application-Centered Models for Pedestrian Trajectory Prediction in Urban Traffic Environments. 1062-1072 - Xuewen Luo, Fan Ding, Ruiqi Chen, Rishikesh Panda, Junnyong Loo, Shuyun Zhang:
"What's Happening"- A Human-centered Multimodal Interpreter Explaining the Actions of Autonomous Vehicles. 1073-1080 - Tayssir Bouraffa, Dimitrios Koutsakis, Salvija Zelvyte:
Deep Learning-based rPPG Models Towards Automotive Applications: A Benchmark Study. 1081-1090 - Eleanor Byler, Christian Svinth, Kirsten Chojnicki:
Location Generalizability of Image-Based Air Quality Models. 1091-1100 - Murat Osswald, Louis Niederlohner, Sascha Köjer, Tobias Ziedorn, Valerio Gulli, Michael Mommert, Helmut Mayer:
FineAir: Finest-grained Airplanes in High-resolution Satellite Images. 1101-1109 - Sara Shojaei, Trevor Bohl, Kannappan Palaniappan, Filiz Bunyak:
Adaptive Structure-Aware Connectivity-Preserving Loss for Improved Road Segmentation in Remote Sensing Images. 1120-1128 - Marvin Burges, Philipe Ambrozio Dias, Carson Woody, Sarah Walters, Dalton D. Lunga:
Interactive Rotated Object Detection for Novel Class Detection in Remotely Sensed Imagery. 1129-1137 - Diego A. Velázquez, Pau Rodríguez López, Sergio Alonso, Josep M. Gonfaus, Jordi Gonzàlez, Gerardo Richarte, Javier Marin, Yoshua Bengio, Alexandre Lacoste:
EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision. 1138-1147 - Valentin Wagner, Sebastian Bullinger, Christoph Bodensteiner, Michael Arens:
Semantic Neural Radiance Fields for Multi-Date Satellite Data. 1148-1156 - Robin Schön, Julian Lorenz, Daniel Kienzle, Rainer Lienhart:
SkipClick: Combining Quick Responses and Low-Level Features for Interactive Segmentation in Winter Sports Contexts. 1157-1166 - Edoardo Bianchi, Oswald Lanz:
Gate-Shift-Pose: Enhancing Action Recognition in Sports with Skeleton Information. 1167-1174 - Maria Koshkina, James H. Elder:
Towards Long-Term Player Tracking with Graph Hierarchies and Domain-Specific Features. 1175-1185 - Davide Cavicchini, Alessia Pivotto, Sofia Lorengo, Andrea Rosani, Nicola Garau:
CLaP - Contrast, Label, Predict: A Quest for Cheaper Labeling in 3D Human Pose Estimation. 1186-1194 - Tushar Shinde, Shivam Bhardwaj:
Mixed-Precision is All You Need for Efficient Document Image Classification. 1195-1203 - Nicolas Klenert, Finn Schwoerer, Noushin Hajarolasvadi, Siloé Bournez, Tobias Arlt, Heinz-Eberhard Mahnke, Verena Lepper, Daniel Baum:
Improving the Identification of Layers in 3D Images of Ancient Papyrus Using Artificial Neural Networks. 1204-1212 - Phan Phuong Mai Chau, Souhail Bakkali, Antoine Doucet:
DocSum: Domain-Adaptive Pre-training for Document Abstractive Summarization. 1213-1222 - Vincenzo Armandi, Andrea Loretti, Lorenzo Stacchio, Pasquale Cascarano, Gustavo Marfia:
Multi-Modal Large Language Model Driven Augmented Reality Situated Visualization: The Case of Wine Recognition. 1223-1232 - Lukas Hüttner, Martin Mayr, Thomas Gorges, Fei Wu, Mathias Seuret, Andreas K. Maier, Vincent Christlein:
Low-Rank Adaptation vs. Fine-Tuning for Handwritten Text Recognition. 1233-1242 - Eliott Thomas, Mickaël Coustaty, Aurélie Joseph, Gaspar Deloin, Elodie Carel, Vincent Poulain D'Andecy, Jean-Marc Ogier:
RAPTOR: Refined Approach for Product Table Object Recognition. 1243-1252 - Iknoor Singh, Miguel Colom, Kalina Bontcheva:
A Comparative Analysis of OCR Models on Diverse Datasets: Insights from Memes and Hiertext Dataset. 1253-1263 - Valentina Arrigoni:
Offline Signature Verification in the Banking Domain. 1264-1273 - Navin Ranjan, Bruno Artacho, Andreas E. Savakis:
WTPose: Waterfall Transformer for Multi-person Pose Estimation. 1274-1281 - Ameni Trabelsi, Maria Zontak, Yiming Qian, Brian Jackson, Suleiman Ali Khan, Umit Batur:
What Matters When Building Vision Language Models for Product Image Analysis? 1282-1291 - Anurag Pandey, Arnav Bhavsar, Aditya Nigam, Divya Acharya, Basu Verma, Balaji Rao K:
SIGN-GAIL: Rewarding Online Signature Generation for Digital Imitation. 1292-1301 - Sander De Coninck, Sam Leroux, Pieter Simoens:
Exploring Correlated Facial Attributes in Text-to-Image Models: Unintended Consequences in Synthetic Face Generation. 1302-1311 - Andrea Ciamarra, Roberto Caldelli, Alberto Del Bimbo:
On the Generalisation Capability of Local Surface Frames in Detecting Diffusion-Based Facial Images. 1312-1321 - Abhishek Tandon, Geetanjali Sharma, Gaurav Jaswal, Aditya Nigam, Raghavendra Ramachandra:
Generating Realistic Forehead-Creases for User Verification via Conditioned Piecewise Polynomial Curves. 1322-1330 - Elmokhtar Mohamed Moussa, Ioannis Sarridis, Emmanouil Krasanakis, Nathan Ramoly, Symeon Papadopoulos, Ahmad Montaser Awal, Lara Younes:
Face-swapping Based Data Augmentation for ID Document and Selfie Face Verification. 1331-1338 - Arun Kunwar, Ajita Rattani:
Unified Face Matching and Physical-Digital Spoofing Attack Detection. 1339-1349 - Huy Phan, Boshi Huang, Ayush Jaiswal, Ekraam Sabir, Prateek Singhal, Bo Yuan:
Latent Diffusion Shield - Mitigating Malicious Use of Diffusion Models Through Latent Space Adversarial Perturbations. 1350-1358 - Rishabh Shukla, Harkeerat Kaur, Isao Echizen:
Parallel Prints: Generating Realistic Cancelable Fingerprint Templates. 1359-1368 - Despina Konstantinidou, Christos Koutlis, Symeon Papadopoulos:
TextureCrop: Enhancing Synthetic Image Detection Through Texture-Based Cropping. 1369-1378 - Anjith George, Sébastien Marcel:
Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models. 1379-1388 - Qice Qin, Yuki Hirakawa, Ryotaro Shimizu, Takuya Furusawa, Edgar Simo-Serra:
Fashionability-Enhancing Outfit Image Editing with Conditional Diffusion Models. 1389-1399 - Nikolai Goncharov, Donald G. Dansereau:
Segment Anything in Light Fields for Real-Time Applications via Constrained Prompting. 1400-1406 - Muneeb Ahmed Khan, Hyungsub Kim, Jiho Eum, Yihyun Myung, Yujin Choi, Heemin Park:
M-GAID: A Real-World Dataset for Ghosting Artifact Detection and Removal in Mobile Imaging. 1407-1416 - Aotian Zheng, Jenq-Neng Hwang, Yudong Liu, Qiancheng Li, Beverly Barnett, Farron Wallace, Irina Benson, Thomas Helser:
Automatic Fish Age Prediction Using Deep Machine Learning: Combining Otolith Image, FT-NIR Spectra and Metadata Features. 1417-1424 - Nenyi Kweku Nkensen Dadson, Corina Barbalata:
Marine Event Vision: Harnessing Event Cameras for Robust Object Detection in Marine Scenarios. 1425-1434 - Evan Lucas, Ali Awad, Anthony Geglio, Ashraf Saleem, Shadi Moradi, Timothy C. Havens, Angus Galloway, Sidike Paheding:
Underwater Image Enhancement and Object Detection: Are Poor Object Detection Results On Enhanced Images Due to Missing Human Labels? 1435-1440 - Md Mahmuddun Nabi Murad, Bora San Turgut, Awwab Ahmed, Gokhan Camliyurt, Yasin Yilmaz:
Camera-based Intruder Detection and Monitoring of Ship Crew Work Hours. 1441-1449 - Matej Fabijanic, Maja Magdalenic, Juraj Obradovic, Nadir Kapetanovic, Fausto Ferreira, Nikola Miskovic:
Vessel Registration Number Detection and Recognition System. 1450-1456 - Benjamin Kiefer, Lojze Zust, Jon Muhovic, Matej Kristan, Janez Pers, Matija Tersek, Uma Mudenagudi, Chaitra Desai, Arnold Wiliem, Marten Kreis, Nikhil Akalwadi, Yitong Quan, Zhiqiang Zhong, Zhe Zhang, Sujie Liu, Xuran Chen, Yang Yang, Matej Fabijanic, Fausto Ferreira, Seongju Lee, Junseok Lee, Kyoobin Lee, Shanliang Yao, Runwei Guan, Xiaoyu Huang, Yi Ni, Himanshu Kumar, Yuan Feng, Yi-Ching Cheng, Tzu-Yu Lin, Chia-Ming Lee, Chih-Chung Hsu, Jannik Sheikh, Andreas Michel, Wolfgang Gross, Martin Weinmann, Josip Saric, Yipeng Lin, Xiang Yang, Nan Jiang, Yutang Lu, Fei Feng, Ali Awad, Evan Lucas, Ashraf Saleem, Ching-Heng Cheng, Yu-Fan Lin:
3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results. 1457-1484 - Ch Muhammad Awais, Marco Reggiannini, Davide Moroni:
A Framework for Imbalanced SAR Ship Classification: Curriculum Learning, Weighted Loss Functions, and a Novel Evaluation Metric. 1485-1493 - Rui Yi Yong, Samuel Picosson, Arnold Wiliem:
MTReD: 3D Reconstruction Dataset for Fly-Over Videos of Maritime Domain. 1494-1502 - Massimiliano Ciranni, Ani Gjergji, Andrea Maracani, Vittorio Murino, Vito Paolo Pastore:
In-domain Self-supervised Learning for Plankton Image Classification on a Budget. 1503-1512 - Stefan Hein Bengtson, Daniel Lehotský, Vasiliki Ismiroglou, Niels Madsen, Thomas B. Moeslund, Malte Pedersen:
AutoFish: Dataset and Benchmark for Fine-Grained Analysis of Fish. 1513-1522 - Tushar Shinde, Avinash Kumar Sharma, Shivam Bhardwaj, Ahmed Silima Vuai:
Navigating Coreset Selection and Model Compression for Efficient Maritime Image Classification. 1523-1531 - Rohan Sawahn, Maximilian Schall:
FalconEye: Efficient Centroid-Based Object Detection for Maritime High Altitude UAV Images on Embedded Devices. 1532-1543 - Elahe Soltandoost, Richard Plesh, Stephanie Schuckers, Peter Peer, Vitomir Struc:
Extracting Local Information from Global Representations for Interpretable Deepfake Detection. 1544-1554 - Muhammad Umar Farooq, Awais Khan, Kutub Uddin, Khalid Mahmood Malik:
Transferable Adversarial Attacks on Audio Deepfake Detection. 1555-1564 - Eduarda Caldeira, Guray Ozgur, Tahar Chettaoui, Marija Ivanovska, Peter Peer, Fadi Boutros, Vitomir Struc, Naser Damer:
MADation: Face Morphing Attack Detection with Foundation Models. 1565-1575 - Lalith Bharadwaj Baru, Rohit Boddeda, Shilhora Akshay Patel, Sai Mohan Gajapaka:
Wavelet-Driven Generalizable Framework for Deepfake Face Forgery Detection. 1576-1584 - Tushar Shinde:
Model Compression Meets Resolution Scaling for Efficient Remote Sensing Classification. 1576-1584 - Nitish Shukla, Arun Ross:
Metric for Evaluating Performance of Reference-Free Demorphing Methods. 1585-1591 - Ewelina Bartuzi-Trokielewicz, Alicja Martinek, Adrian Kordas:
Face Detection and Recognition Under Real-World Scenarios - Dealing with Deepfake Incidents and Malicious Data Distortions. 1592-1601 - Alain Komaty, Hatef Otroshi-Shahreza, Anjith George, Sébastien Marcel:
Exploring ChatGPT for Face Presentation Attack Detection in Zero and Few-Shot in-Context Learning. 1602-1611
