Wednesday December 11, 2024 (Day 4)
[9:00–10:00] Masaru Ibuka Auditorium
"AI Coding Reality and Perspectives"
Dr. Elena Alshina
Chair: Lu Yu (Zhejiang University)
[10:30–12:00] Masaru Ibuka Auditorium
"SS-4: Implicit and Explicit Neural Representations for nD Video Compression"
Chair: Yiyi Liao (Zhejiang University)
[P-ID 219] PET-NeRV: Bridging Generalized Video Codec and Content-Specific Neural Representation
Hao Li (Zhejiang University); Lu Yu (Zhejiang University); Yiyi Liao (Zhejiang University)*
[P-ID 274] A Practical Approach to Depth-Aware Augmentation for Neural Radiance Fields
Hamed Razavi Khosroshahi (Université libre de Bruxelles (ULB))*; Jaime Sancho (Universidad Politécnica de Madrid); Daniele Bonatto (Université Libre de Bruxelles); Sarah Fachada (Université Libre de Bruxelles); Gun Bang (ETRI); Gauthier Lafruit (ULB-LISA); Eduardo Juarez (Universidad Politécnica de Madrid); Mehrdad Teratani (Université Libre de Bruxelles)
[P-ID 168] Dynamic Volumetric Video Coding with Tensor Decomposition
Juyeon Shin (Ewha Womans University); Yeoneui Kim (Ewha Womans University); Gun Bang (ETRI); Jewon Kang (Ewha Womans University)*
[P-ID 292] Compressing 3D Gaussian Splatting via a Generalizable Neural Coder
Junteng Zhang (Nanjing University)*; Tong Chen (Nanjing University); Hao Zhu (Nanjing University); Dong Wang (Guangdong OPPO Mobile Telecommunications Corp., Ltd.); Dandan Ding (Hangzhou Normal University); Zhan Ma (Nanjing University)
[10:30–12:00] Conference Room 1
"Understanding/Recognition/Detection"
Chair: Jiro Katto (Waseda University)
[P-ID 013] IRAD: Input-Reference Joint Driven Reconstruction for Unified Anomaly Detection
Zixin Chen (Shanghai Jiao Tong University)*; Xincheng Yao (Shanghai Jiao Tong University); Yan Luo (Shanghai Jiao Tong University); Baozhu Zhang (Ningbo Haitang Information Technology Co., Ltd.); Zhenyu Liu (Ningbo Haitang Information Technology Co., Ltd.); Chongyang Zhang (Shanghai Jiao Tong University)
[P-ID 246] Generative Representation and Discriminative Classification for Few-shot Open-set Object Detection
Peixue Shen (Shanghai Jiao Tong University); Ruoqi Li (SJTU); Yan Luo (Shanghai Jiao Tong University); Yiru Zhao (Shanghai Jiao Tong University); Chao Gao (China Pacific Insurance (Group) Co., Ltd.); Chongyang Zhang (Shanghai Jiao Tong University)*
[P-ID 146] Fully Aligned Network for Referring Image Segmentation
Yong Liu (Tsinghua University)*; Ruihao Xu (Tsinghua University); Yansong Tang (Tsinghua University)
[P-ID 034] Localization-Aware Multi-Scale Representation Learning for Repetitive Action Counting
Sujia Wang (Tsinghua University)*; Xiangwei Shen (Tsinghua University); Yansong Tang (Tsinghua University); Xin Dong (Tsinghua University); Wenjia Geng (Shenzhen International Graduate School, Tsinghua University); Lei Chen (Tsinghua University)
[P-ID 259] Performance Evaluation of Feature Detectors and Descriptors with Close-Range Solar Panel Images
Eman Ansar (Carnegie Mellon University in Qatar); Sara Zewil (Carnegie Mellon University in Qatar); Fathimath Zuha Maksood (Carnegie Mellon University in Qatar); Eduardo Marcelo Feo Flushing (Carnegie Mellon University in Qatar)*
[13:30–15:00] Masaru Ibuka Auditorium
"SS-5: Emerging Trends in Learning-based Image/Video Coding and Perceptual Quality Assessment"
Chair: Yiyi Liao (Zhejiang University)
[P-ID 069] NeRV++: An Enhanced Implicit Neural Video Representation
Ahmed Ghorbel (Ecole polytechnique)*; Wassim Hamidouche (INSA Rennes); Luce Morin (INSA Rennes)
[P-ID 271] Improving Reconstruction Fidelity in Generative Face Video Coding using High Frequency Shuttling
Goluck Konuko (L2S - CentraleSupélec, Université Paris Saclay)*; Giuseppe Valenzise (CNRS, CentraleSupélec); Anthony TRIOUX (Xidian University, School of Telecommunications Engineering, Xi'an, China)
[P-ID 273] Characterizing the geometric complexity of G-PCC compressed point clouds
Annalisa Gallina (Università degli Studi di Padova); Hadi Amirpour (University of Klagenfurt); Sara Baldoni (University of Padova)*; Giuseppe Valenzise (CNRS); Federica Battisti (University of Padova)
[P-ID 049] ReLI-QA: A Multidimensional Quality Assessment Dataset for Relighted Human Heads
Yingjie Zhou (Shanghai Jiao Tong University)*; Zicheng Zhang (Shanghai Jiao Tong University); Farong Wen (Shanghai Jiao Tong University); Jun Jia (Shanghai Jiao Tong University); Xiongkuo Min (Shanghai Jiao Tong University); Jia Wang (Shanghai Jiao Tong University); Guangtao Zhai (Shanghai Jiao Tong University)
[P-ID 187] Quantizing Neural Networks with Knowledge Distillation for Efficient Video Quality Assessment (Best Paper Candidate)
Jiayuan Yu (Zhejiang University); Yingming Li (Zhejiang University)*
[13:30–15:00] Conference Room 1
"Visual Processing/Enhancement/Restoration"
Chair: Xin Jin (Shenzhen International Graduate School, Tsinghua University)
[P-ID 167] Frame Similarity-Based Screen Content Video Quality Enhancement via Adaptive Long Short-Term Fusion
Ziyin HUANG (The Hong Kong Polytechnic University); Yui-Lam Chan (The Hong Kong Polytechnic University)*; Ngai-Wing Kwong (The Hong Kong Polytechnic University); Sik-Ho Tsang (The Hong Kong Polytechnic University); Kin-Man Lam (The Hong Kong Polytechnic University); Bingo Wing Kuen Ling (Guangdong University of Technology)
[P-ID 014] FSDN: Image frequency and semantic decomposition network for image dehazing
Zongyang Tong (Tsinghua University); Mingyu Liu (Tsinghua University); Xin Jin (Tsinghua University)*
[P-ID 165] Cross-Device Image Saliency Detection: Database and Comparative Analysis
Xiaoying Ding (Zhongnan University of Economics and Law); Guanghui Yue (Shenzhen University); Yingxue Zhang (Tianjin University of Science and Technology)*
[P-ID 180] Lookup Register-Tables with Interpolation for Effective Image Transformation on x86/64 CPUs
Hirokazu Kamei (Nagoya Institute of Technology); Soichiro Honda (Nagoya Institute of Technology); Kohei Hayashi (Nagoya Institute of Technology); Yoshihiro Maeda (Shibaura Institute of Technology); Norishige Fukushima (Nagoya Institute of Technology)*
[P-ID 181] Motion Estimation for Quanta Image Sensors Using Spatio-Temporal Priors
Hiroya Fukawa (Tokyo University of Science)*; Kosuke Kurihara (Tokyo University of Science); Yoshihiro Maeda (Shibaura Institute of Technology); Shunichi Sato (Tokyo University of Science); Takayuki Hamamoto (Tokyo University of Science)
[15:00–17:00] Conference Room 2
Chair: Jui-Chiu Rachel Chiang (National Chung Cheng University)
[P-ID 204] Anchoring Vision and Language Knowledge for Weakly Supervised Group Activity Recognition
Muhammad Adi Nugroho (KAIST)*; Jinyoung Park (KAIST); Donguk Kim (KAIST); Changick Kim (KAIST)
[P-ID 277] Pseudo Dataset Generation for Out-of-domain Multi-Camera View Recommendation
Kuan-Ying Lee (University of Illinois at Urbana-Champaign)*; Qian Zhou (University of Illinois at Urbana-Champaign); Klara Nahrstedt (University of Illinois at Urbana-Champaign)
[P-ID 207] Deep Reinforcement Learning-Based Camera Autofocus with Gaussian Process Regression
Li Wei (Shanghai Jiao Tong University)*; Yuankun Jiang (Shanghai Jiao Tong University); Chenglin Li (Shanghai Jiao Tong University); Wenrui Dai (Shanghai Jiao Tong University); Junni Zou (Shanghai Jiao Tong University); Hongkai Xiong (Shanghai Jiao Tong University)
[P-ID 288] High-Fidelity Image Style Transfer by Hybrid Transformers
Zhe-Wei Hsu (National Taipei University of Technology); Shih-Hsuan Yang (National Taipei University of Technology)*; Bo-Jiun Tung (National Taipei University of Technology)
[P-ID 214] PFT-ILF: In-loop Filter with Partition Feature Transform for Versatile Video Coding
Xin-Yi Cui (School of Electronics and Information Technology, Sun Yat-Sen University)*; Zhikai Liu (Sun Yat-sen University); Zhidao Zhou (Sun Yat-Sen University); Li Chen (School of Electronics and Information Technology, Sun Yat-Sen University); Fan Liang (Sun Yat-sen University)
[P-ID 242] Image Forensics Strikes Back: Defense Against Adversarial Patch
Ching-Chia Kao (National Taiwan University; Academia Sinica); Chun-Shien Lu (Academia Sinica)*; Chia-Mu Yu (National Yang Ming Chiao Tung University)
[P-ID 248] Lightweight Arbitrary-Scale Super-Resolution of Remote Sensing Images via Super-Scale Feature
Yifei Long (Wuhan University); Yuantong Zhang (Wuhan University); Daiqin Yang (Wuhan University); Zhenzhong Chen (Wuhan University)*; Huairui Wang (Wuhan University); Shan Liu (Tencent America)
[P-ID 251] Predicting total time to compress a video corpus using online inference systems
Xin Shu (Trinity College Dublin)*; Vibhoothi Vibhoothi (Trinity College Dublin); Anil Kokaram (Trinity College Dublin, Ireland)
[P-ID 253] UplinkNet: Practical Commercial 5G Standalone (SA) Uplink Throughput Prediction
Kasidis Arunruangsirilert (Waseda University)*; Jiro Katto (Waseda University)
[P-ID 254] MAESR360: Masked autoencoder-based 360-degree video streaming via multi-scale feature fusion
Li Yu (Nanjing University of Information Science and Technology)*; Zhiyu Pang (Nanjing University of Information Science and Technology); Moncef Gabbouj (Tampere University)
[P-ID 255] Lightweight Stochastic Video Prediction via Hybrid Warping
Kazuki Kotoyori (Waseda University)*; Shota Hirose (Waseda University); Heming Sun (Yokohama National University); Jiro Katto (Waseda University)
[P-ID 266] Energy-Efficient Video Streaming: A Study on Bit Depth and Color Subsampling
Hadi Amirpour (University of Klagenfurt)*; Lingfeng Qu (SWJTU); Jong Hwan Ko (Sungkyunkwan University); Cosmin Stejerean (Meta); Christian Timmerer (Alpen-Adria-Universität Klagenfurt)
[P-ID 279] Fast Retrieval of Pharmaceutical Packaging Images Using Keypoint Matching with Angle and Scale Voting for Outlier Rejection
Yona Zakaria (Nara Institute of Science and Technology)*; Rui Ishiyama (NEC); Eiki Ishidera (NEC); Tomokazu Matsui (Nara Institute of Science and Technology); Keiichi Yasumoto (Nara Institute of Science and Technology, Japan)
[P-ID 217] Adaptive Hint Propagation for Iterative Stereo Matching
Anning Hu (Shanghai Jiao Tong University)*; Ang Li (Shanghai Jiao Tong University); Danping Zou (Shanghai Jiao Tong University)
[P-ID 224] AgeSynthGAN: Advanced Facial Age Synthesis with StyleGAN2
Tung-Ke Hsieh (National Chung Hsing University); Tsung-Jung Liu (National Chung Hsing University)*; Kuan-Hsien Liu (National Taichung University of Science and Technology)
[P-ID 241] Lightweight Graph Convolutional Network Based on Multi-Head Residual Attention for Hand Point Classification
Duc-Chinh Nguyen (International School, Vietnam National University); Manh-Hung Ha (International School, Vietnam National University)*; Manh-Tuan Do (International School, Vietnam National University); Oscal T.-C. Chen (National Chung Cheng University)
[P-ID 245] KonIQ-10k-LT: Overcoming Score Priors in Blind Image Quality Assessment Under Imbalanced Distributions
Desen Yuan (UESTC)*; Lei Wang (University of Electronic Science and Technology of China)
[P-ID 263] MSCFormer: Multi-Scale Circular Transformer for Image Deblurring
Shuai Wang (School of Microelectronics, Tianjin University)*; Han Wang (School of Microelectronics, Tianjin University); Renhe Liu (School of Microelectronics, Tianjin University); Zhipeng Wu (School of Microelectronics, Tianjin University); Bo Wei (School of Engineering, The University of Tokyo); Yu Liu (Tianjin University, Tianjin 300072, China)
[P-ID 280] LMM-driven Semantic Image-Text Coding for Ultra Low-bitrate Learned Image Compression
Shimon Murai (Waseda University)*; Heming Sun (Yokohama National University); Jiro Katto (Waseda University)
[15:30–17:00] Masaru Ibuka Auditorium
"3D"
Chair: Jianfeng Xu (KDDI Research, Inc.)
[P-ID 038] Adaptive Threshold Mask Prediction and Occlusion-aware Convolution for Foreground Occlusions in Light Fields
Jieyu Chen (Shanghai University); Ping An (Shanghai University)*; Xinpeng Huang (Shanghai University); Chao Yang (Shanghai University)
[P-ID 229] Mirror-3DGS: Incorporating Mirror Reflections into 3D Gaussian Splatting
Jiarui Meng (Peking University)*; Haijie LI (Peking University); Yanmin Wu (Peking University); Qiankun Gao (Peking University Shenzhen Graduate School); Shuzhou Yang (Peking University); Jian Zhang (Peking University); Siwei Ma (Peking University, China)
[P-ID 070] Coarse-to-fine Transformer For Lossless 3D Medical Image Compression
Yang Xiaoxuan (Shanghai Jiao Tong University); Guo Lu (Shanghai Jiao Tong University)*; Donghui Feng (Cooperative Medianet Innovation Center, Shanghai, China); Zhengxue Cheng (Shanghai Jiao Tong University); Guosheng Yu (T-Head Semiconductor Co., Ltd, Alibaba Group); Li Song (Shanghai Jiao Tong University)
[P-ID 283] Efficient Camera Pose Adjustment to a Mirror Array for Structured Light Field Video Acquisition
Shunsuke Maeda (Tokyo University of Science)*; Kazuya KODAMA (Research Organization of Information and Systems); Takayuki Hamamoto (Tokyo University of Science)
[P-ID 099] Explicit-NeRF-QA: A Quality Assessment Database for Explicit NeRF Model Compression
Yuke Xing (Shanghai Jiao Tong University)*; Qi Yang (Tencent); Kaifa Yang (Shanghai Jiao Tong University); Yiling Xu (Shanghai Jiao Tong University); Zhu Li (University of Missouri-Kansas City)
[15:30–17:00] Conference Room 1
"Quality Assessment"
Chair: Yasuko Sugito (NHK)
[P-ID 103] Benchmarking Conventional and Learned Video Codecs with a Low-Delay Configuration
Siyue Teng (University of Bristol)*; Yuxuan Jiang (University of Bristol); Ge Gao (University of Bristol); Fan Zhang (University of Bristol); Thomas J Davies (Visionular); Zoe Liu (Visionular Inc); David Bull (University of Bristol)
[P-ID 212] Image-Prompt Integration Network with Self-Ranking and Inter-Ranking Loss for AI-Generated Image Quality Assessment
Xizhang Yao (Shenzhen University)*; Tianwei Zhou (Shenzhen University); Guanghui Yue (Shenzhen University); Songbai Tan (Shenzhen University); Xiaoying Ding (Zhongnan University of Economics and Law)
[P-ID 108] Perceptual Skin Tone Color Difference Measurement for Portrait Photography
Shiqi Gao (Shanghai Jiao Tong University)*; Huiyu Duan (Shanghai Jiao Tong University); Qihang Xu (Transsion); Jia Wang (Shanghai Jiao Tong University); Xiongkuo Min (Shanghai Jiao Tong University); Guangtao Zhai (Shanghai Jiao Tong University); Patrick Le Callet (Universite de Nantes, France)
[P-ID 223] Multi-Screen Effects on Quality Assessment: Investigating Banding Metrics Inconsistencies
Nickolay I Safonov (MSU)*; Dmitriy S Vatolin (Lomonosov Moscow State University); Dmitriy Kulikov (Lomonosov Moscow State University, Dubna University)
[P-ID 232] Judder Modelling Framework with Perceptual Quality Score Prediction for HDR Videos
Hongjie You (Technical University of Munich)*; Zhendong Li (TUM); Nicola Giuliani (TU Munich); Atanas Boev (Huawei Technologies Duesseldorf GmbH); Elena Alshina (Huawei Technologies); Eckehard Steinbach (TUM)