


26th ISM 2024: Tokyo, Japan
- IEEE International Symposium on Multimedia, ISM 2024, Tokyo, Japan, December 11-13, 2024. IEEE 2024, ISBN 979-8-3315-1111-1
- Subhadra Gopalakrishnan, Trisha Mittal, Jaclyn Pytlarz, Yuheng Zhao:
S2MGen: A synthetic skin mask generator for improving segmentation. 1-8
- Yi-Chieh Wu, Yu-Jung Hsu:
Generating and Evaluating Cursive Chinese Calligraphy by Semi-Classifying Style: A Case Study Using a Diffusion Model. 9-16
- Yassine Belkhouche, AlaaIdin Dwaik:
StegoFusion-Net: Fusion of Convolutional Neural Networks for Spatial Image Steganalysis. 17-23
- Hisayoshi Kaneda, Ryota Kawamata, Kazuyoshi Yamazaki, Kazuya Shimizu:
Disparity Correction Method of the Monocular Omnidirectional Stereo Camera. 24-25
- Wen-Hung Liao, Po-Han Chen, Yi-Chieh Wu:
Unveiling the Potential of SSL-Generated Audio Embeddings for Cross-Lingual Speaker Recognition. 26-32
- Di Hu, Katunobu Ito:
Two-stage instrument timbre transfer method using RAVE. 33-40
- Aoi Ito, Katunobu Itou:
Speaker Pseudonymization for Japanese Speech Using Duration Embeddings. 41-48
- Duc V. Nguyen, Quang Long Nguyen, Tran Thuy Hien, Nguyen Ngoc Huyen, Truong Thu Huong, Pham Ngoc Nam:
Modeling User Quality of Experience in Adaptive Point Cloud Video Streaming. 49-54
- Steve Göring, Rasmus Merten, Alexander Raake:
Appeal prediction for AI up-scaled Images. 55-62
- Tailai Song, Paolo Garza, Michela Meo, Maurizio Matteo Munafò:
Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications. 63-70
- Sushant Gautam, Mehdi Houshmand Sarkhoosh, Jan Held, Cise Midoglu, Anthony Cioppa, Silvio Giancola, Vajira Thambawita, Michael A. Riegler, Pål Halvorsen, Mubarak Shah:
SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset. 71-78
- Peter O. Fasogbon:
Ensuring Color Consistency in RGB-D Multi-Camera Setup. 79-84
- Ahmadreza Sezavar, Catarina Brites, João Ascenso:
Low Complexity Learning-based Lossless Event-based Compression. 85-92
- Håkon Maric Solberg, Mehdi Houshmand Sarkhoosh, Sushant Gautam, Saeed Shafiee Sabet, Pål Halvorsen, Cise Midoglu:
PlayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips. 93-97
- Wei Zhang, Victor Soares Bursztyn:
Flexible And Faithful Data Insights Generation. 98-105
- Shuntaro Masuda, Toshihiko Yamasaki:
Holistic Visualization of Contextual Knowledge in Hotel Customer Reviews Using Self-Attention. 106-109
- Wen-Hung Liao, Yang-Jing Lin:
Investigation of Feature Distribution and Network Weight Updates in the Machine Unlearning Process. 110-113
- Greeshma Sree Parimi, Gurkirat Singh Guliani, Min Chen:
Platform for Endangered Language Education. 114-115
- Clément Saint-Marc, Katunobu Itou:
Homophonic Music Composition Using a GAN and LSTM Pipeline for Melody and Harmony Generation. 116-119
- Yuhuan Wang, Katunobu Itou:
Instrumentality Classification Evaluation System for Natural Sounds*. 120-123
- Tomoo Kouzai, Junya Koguchi, Tetsuro Kitahara:
Generating Bass Phrases from Guitar Chord Backing with NMF. 124-125
- Jakub Kovác, Wolfgang Hürst:
Watch your back! Dynamic thumbnails for a 360-degree video player to enhance viewing experience on 2D displays. 126-132
- Daichi Arai, Yuichi Kondo, Yasuko Sugito, Yuichi Kusakabe:
Influence of Display Devices and Field of View on Subjective Quality of Experience Evaluation of 8K 360° Videos. 133-136
- Serkan Sulun, Paula Viana, Matthew E. P. Davies:
VEMOCLAP: A video emotion classification web application. 137-140
- Zhikai Liu, Kun Zhang, Xin-Yi Cui, Wei Sun, Fan Liang:
A Power-Law Transformation Approach for Template-Based Cross-Component Prediction. 141-142
- Dominik Keller, Paul Rudi Frank, Steve Göring, Alexander Raake:
Investigating the Impact of High Frame Rate on Video Quality: A SAMVIQ Approach. 143-144
- Tran Gia Minh, Truong Thu Huong, Duc V. Nguyen:
A Server-driven View-aware Point Cloud Video Streaming Framework. 145-148
- Pedro Martin, António Rodrigues, João Ascenso, Maria Paula Queluz:
Evaluation of strategies for efficient rate-distortion NeRF streaming. 149-153
- Yumeka Chujo, Yusuke Tagashira, Yukiko Harada, Kenji Kanai, Jiro Katto:
Perceptual Quality Driven Point Cloud Compression for 6DoF 3D Point Cloud Streaming. 154-157
- Yuriy A. Reznik, Guillem Cabrera:
On Multi-CDN Delivery Costs Optimization Problem. 158-161
- Geerthan Srikantharajah, Naimul Khan:
Sliding Window Check: Repairing Object Identities. 162-169
- Genta Matsukawa, Atsuo Yoshitaka:
Data Augmentation with Diffusion Model for Hand Detection. 170-173
- Keita Yamane, Akira Kitayama, Keigo Hasegawa, Yusuke Obonai, Hiroto Sasao:
AI Maintenance Techniques by Detecting Performance Degradation in Domain Shift Using Model Ensembles. 174-175
- Raphael Waltenspül, Florian Spiess, Heiko Schuldt:
Cross-Modal 3D Model Retrieval. 176-180
- Takumi Komori, Takahiro Hayashi:
Prevention of Unexpected Object Generation in Diffusion Model-Based Inpainting. 181-184
- Maria Tzelepi, Vasileios Mezaris:
LMM-Regularized CLIP Embeddings for Image Classification. 185-188
- Kolja Kieslich, Louay Bassbouss, Stephan Steglich, Stefan Arbanowski:
Evaluation Framework for Novel View Synthesis. 189-192
- Jussif J. Abularach Arnez, Cassio A. Tavares Alves, Wederson Medeiros Silva, Isaac Barros Gomes, Carla Lapa Nogueira, Maria G. Lima Damasceno:
A Simulation for the Evaluation of the Mean Opinion Score (MOS) for EVS-WB and AMR-WB Audio Codecs for 5G Mobile Networks. 193-196
- John Li, Deepak Nair, Klara Nahrstedt, Indranil Gupta, Shehab Sarar Ahmed:
FrameCorr: Adaptive, Autoencoder-based Neural Compression for Video Reconstruction in Resource and Timing Constrained Network Settings. 197-200
- Yasuhiro Mochida, Takuro Yamaguchi, Hirokazu Takahashi, Koichi Takasugi:
Ultra-low-latency 8K120p-video-transmission System Parallelizing SMPTE ST 2110. 201-202
- Takuro Yamaguchi, Yasuhiro Mochida, Hirokazu Takahashi:
Low-latency Software-based Uncompressed Video Transmission. 203-204
- Pengcheng Zeng, Atsuo Yoshitaka:
Visual Speech Recognition with Surrounding and Emotional Information. 205-212
- John Murray, Michael Zink:
Synchronized Object Sharing for Augmented Reality Virtual Conferencing. 213-218
- Viviana Crescitelli, Takashi Oshima:
Fusion-Based Human Pose Estimation Using RGB and IR Images with Transformer-Based Decoding. 219-220
- Kin Ching Lydia Chau, Zhi Yu, Ruowei Jiang:
Occlusion-Aware Real-Time Tiny Facial Alignment Model for Makeup Virtual Try-On. 221-224
- Nan Bu, Kakeru Nakano:
A Study on Mental Stress Test using Cybersickness caused by Virtual Reality Contents. 225-226
- Jana Motowilowa, Maurizio Vergari, Tanja Kojic, Maximilian Warsinke, Sebastian Möller, Jan-Niklas Voigt-Antons:
Exploring Augmented Table Setup and Lighting Customization in a Simulated Restaurant to Improve the User Experience. 227-231
- Pedro Baptista de Castro, Hiroko Sukeda, Soichi Takashige:
Human-in-the-loop knowledge base upkeep for retrieval augmented generation applications. 232-233
- Hannes Fassold:
LiveSkeleton: High-Quality Real-Time Human Tracking and Pose Estimation. 234-235
- Florian Schimanke, Robert Mertens, Felix Prankel:
A technical Concept for enhancing the Student Experience in Hybrid Lecture Scenarios. 236-241
- Ryota Kishimoto, Shuhei Tsuchida, Tsutomu Terada, Masahiko Tsukamoto:
SpotiView: Partial Face Display Method for Smooth Communication While Protecting Privacy. 242-249
- Rajini Chittimalla, Sujung Choi, Madhu Sai Vineel Reka, Yassine Belkhouche:
Characterizing students behavior in multi-user multi-computer testing environments. 250-254
- Alexander Gantikow, Andreas Isking, Wolfgang Müller, Paul Libbrecht, Sandra Rebholz:
Evaluating Interactive Concept Maps Produced from E-Portfolios. 255-260
- Gabriel Valerio-Ureña, Giomara Sevilla-Campoverde, Soledad Ortúzar, Christian Lazcano:
Gender Stereotypes in the Creation of Educational Cases with ChatGPT. 261-266
- Karam Dawoud, Birgit Nierula, Farelle Toumaleu Siewe, Thomas Koch, Daniel Johannes Meyer, Andreas Bock, Marianne Heinze, Daniela Knuth, Denis Martin, Julia Schander, Anna Hilsmann, Peter Eisert, Sebastian Bosse:
Multi-View Gesture Recognition in Conflict Situations. 267-268
- Mario Wolf, Sebastian Hartwig, Gregor Steinhöfel, Heinrich Söbke, Eckhard Kraft:
PanoramaViewer - A Framework for Educational Collaborative Virtual Field Trips. 269-274
- Yusuke Maeda, Takahiro Hayashi:
Real-time Multi-modal Highlight Prediction for Simultaneous Viewing of Multiple Live Streams. 275-278
- Itsuki Sano, Yuanyuan Wang, Yukiko Kawai, Kazutoshi Sumiya:
Slide Analysis Method for Editing Lecture Materials based on Hierarchical Structures of Subject Terminologies. 279-284
- Boris Ruf, Marcin Detyniecki:
The ≪Huh?≫ Button: Improving Understanding in Educational Videos with Large Language Models. 285-289
