


IEEE Transactions on Parallel and Distributed Systems, Volume 36
Volume 36, Number 1, January 2025
- Junqiang Jiang

, Zhifang Sun
, Ruiqi Lu
, Li Pan
, Zebo Peng
:
Real Relative Encoding Genetic Algorithm for Workflow Scheduling in Heterogeneous Distributed Computing Systems. 1-14 - Sanaz Rabinia

, Niloofar Didar
, Marco Brocanelli
, Daniel Grosu
:
Algorithms for Data Sharing-Aware Task Allocation in Edge Computing Systems. 15-28 - Qiang He

, Guobiao Zhang, Jiawei Wang, Ruikun Luo
, Xiaohai Dai
, Yuchong Hu
, Feifei Chen
, Hai Jin
, Yun Yang
:
EdgeHydra: Fault-Tolerant Edge Data Distribution Based on Erasure Coding. 29-42 - Jonatha Anselmi

, Josu Doncel
:
Balanced Splitting: A Framework for Achieving Zero-Wait in the Multiserver-Job Model. 43-54 - Ruikun Luo

, Qiang He
, Feifei Chen
, Song Wu
, Hai Jin
, Yun Yang
:
Ripple: Enabling Decentralized Data Deduplication at the Edge. 55-66 - Haoyu Liao

, Tong-Yu Liu
, Jianmei Guo
, Bo Huang
, Dingyu Yang
, Jonathan Ding:
Retrospecting Available CPU Resources: SMT-Aware Scheduling to Prevent SLA Violations in Data Centers. 67-83 - Ruikun Luo

, Qiang He
, Mengxi Xu, Feifei Chen
, Song Wu
, Jing Yang
, Yuan Gao
, Hai Jin
:
Edge Data Deduplication Under Uncertainties: A Robust Optimization Approach. 84-95 - Guillaume Raffin

, Denis Trystram
:
Dissecting the Software-Based Measurement of CPU Energy Consumption: A Comparative Analysis. 96-107
Volume 36, Number 2, February 2025
- Hariharan Devarajan

, Gerd Heber
, Kathryn M. Mohror
:
H5Intent: Autotuning HDF5 With User Intent. 108-119 - Diletta Olliaro

, Adityo Anggraito
, Marco Ajmone Marsan
, Simonetta Balsamo
, Andrea Marin
:
The Impact of Service Demand Variability on Data Center Performance. 120-132 - Shuai Lin, Rui Wang

, Yongkun Li
, Yinlong Xu
, John C. S. Lui
:
Two-Dimensional Balanced Partitioning and Efficient Caching for Distributed Graph Analysis. 133-149 - Zhi Ling

, Xiaofeng Jiang
, Xiaobin Tan
, Huasen He
, Shiyin Zhu, Jian Yang
:
Joint Dynamic Data and Model Parallelism for Distributed Training of DNNs Over Heterogeneous Infrastructure. 150-167 - Diandian Gu

, Yihao Zhao
, Peng Sun, Xin Jin
, Xuanzhe Liu
:
GreenFlow: A Carbon-Efficient Scheduler for Deep Learning Workloads. 168-184 - Pengwei Wang

, Junye Qiao, Yuying Zhao, Zhijun Ding
:
Cost-Effective and Low-Latency Data Placement in Edge Environment Based on PageRank-Inspired Regional Value. 185-196 - Xiaodong Dong

, Lihai Nie
, Zheli Liu
, Yang Xiang
:
Slark: A Performance Robust Decentralized Inter-Datacenter Deadline-Aware Coflows Scheduling Framework With Local Information. 197-211 - Jialiang Han

, Yudong Han
, Xiang Jing, Gang Huang
, Yun Ma
:
DegaFL: Decentralized Gradient Aggregation for Cross-Silo Federated Learning. 212-225 - Zhongyi Lin

, Ning Sun, Pallab Bhattacharya
, Xizhou Feng, Louis Feng, John D. Owens
:
Towards Universal Performance Modeling for Machine Learning Training on Multi-GPU Platforms. 226-238 - Zhangrong Qin

, Xusheng Lu, Long Lv
, Zhongxiang Tang
, Binghai Wen
:
An Efficient GPU Algorithm for Lattice Boltzmann Method on Sparse Complex Geometries. 239-252 - J. Gregory Pauloski

, Valérie Hayot-Sasson
, Logan T. Ward
, Alexander Brace, André Bauer, Kyle Chard
, Ian T. Foster
:
Object Proxy Patterns for Accelerating Distributed Applications. 253-265 - Changyao Lin

, Zhenming Chen, Ziyang Zhang
, Jie Liu
:
TOP: Task-Based Operator Parallelism for Asynchronous Deep Learning Inference on GPU. 266-281 - Jing Hou

, Guang Chen
, Ruiqi Zhang
, Zhijun Li
, Shangding Gu, Changjun Jiang
:
Spreeze: High-Throughput Parallel Reinforcement Learning Framework. 282-292 - Guangyao Zhou

, Wenhong Tian
, Rajkumar Buyya
, Kui Wu
:
UMPIPE: Unequal Microbatches-Based Pipeline Parallelism for Deep Neural Network Training. 293-307 - Yuyang Jin

, Haojie Wang
, Xiongchao Tang, Zhenhua Guo, Yaqian Zhao, Torsten Hoefler
, Tao Liu
, Xu Liu, Jidong Zhai
:
Leveraging Graph Analysis to Pinpoint Root Causes of Scalability Issues for Parallel Applications. 308-325 - Giacomo Valente

, Gianluca Brilli
, Tania Di Mascio
, Alessandro Capotondi
, Paolo Burgio
, Paolo Valente
, Andrea Marongiu
:
Fine-Grained QoS Control via Tightly-Coupled Bandwidth Monitoring and Regulation for FPGA-Based Heterogeneous SoCs. 326-340
Volume 36, Number 3, March 2025
- Shuaibing Lu

, Ran Yan, Jie Wu
, Jackson Yang, Xinyu Deng, Shen Wu
, Zhi Cai
, Juan Fang
:
Online Elastic Resource Provisioning With QoS Guarantee in Container-Based Cloud Computing. 361-376 - Junyuan Liang

, Peiyuan Yao
, Wuhui Chen
, Zicong Hong
, Jianting Zhang
, Ting Cai
, Min Sun, Zibin Zheng
:
Sparrow: Expediting Smart Contract Execution for Blockchain Sharding via Inter-Shard Caching. 377-390 - Jialun Li

, Danyang Xiao
, Diying Yang
, Xuan Mo
, Weigang Wu
:
Integrated and Fungible Scheduling of Deep Learning Workloads Using Multi-Agent Reinforcement Learning. 391-406 - Saiman Dahal

, Pratyush Dhingra
, Krishu K. Thapa, Partha Pratim Pande
, Ananth Kalyanaraman
:
HpT: Hybrid Acceleration of Spatio-Temporal Attention Model Training on Heterogeneous Manycore Architectures. 407-421 - Yuhan Leng, Gaoyuan Zou, Hansheng Wang

, Panruo Wu
, Shaoshuai Zhang
:
High Performance Householder QR Factorization on Emerging GPU Architectures Using Tensor Cores. 422-436 - Lizhen Zhou

, Zichuan Xu
, Qiufen Xia
, Zhou Xu
, Wenhao Ren
, Wenbo Qi, Jinjing Ma, Song Yan, Yuan Yang:
Chasing Common Knowledge: Joint Large Model Selection and Pulling in MEC With Parameter Sharing. 437-454 - Binqi Sun

, Tomasz Kloda
, Jiyang Chen, Cen Lu, Marco Caccamo
:
Response Time Analysis and Optimal Priority Assignment for Global Non-Preemptive Fixed-Priority Rigid Gang Scheduling. 455-470 - Ziqu Yu

, Jinyu Gu
, Zijian Wu, Nian Liu, Jian Guo:
HTLL: Latency-Aware Scalable Blocking Mutex. 471-486 - Haining Yang

, Dengguo Feng, Jing Qin
:
Towards Efficient Verifiable Cloud Storage and Distribution for Large-Scale Data Streaming. 487-501 - Hongkuan Zhou

, Bingyi Zhang
, Rajgopal Kannan, Carl E. Busart
, Viktor K. Prasanna
:
ViTeGNN: Towards Versatile Inference of Temporal Graph Neural Networks on FPGA. 502-519 - Wenhan Xu

, Hui Ma
, Rui Zhang
, Jianhao Li
:
GPABE: GPU-Based Parallelization Framework for Attribute-Based Encryption Schemes. 520-536
Volume 36, Number 4, April 2025
- Junhee Ryu

, Dongeun Lee
, Kang G. Shin
, Kyungtae Kang
:
Paralfetch: Fast Application Launch on Personal Computing/Communication Devices. 616-632 - Yi Chen

, Qiang-Sheng Hua
, Zixiao Hong
, Lin Zhu, Hai Jin
:
FHE4DMM: A Low-Latency Distributed Matrix Multiplication With Fully Homomorphic Encryption. 645-658 - Zhengjun Cao

:
A Note on "AESM2 Attribute-Based Encrypted Search for Multi-Owner and Multi-User Distributed Systems". 675-676 - Yan Zeng

, Chengchuang Huang
, Yipeng Mei
, Lifu Zhang
, Teng Su
, Wei Ye
, Wenqi Shi, Shengnan Wang
:
EfficientMoE: Optimizing Mixture-of-Experts Model Training With Adaptive Load Balance. 677-688 - Luiz Gustavo Coutinho Xavier

, Cristina Meinhardt
, Odorico Machado Mendizabal
:
Beelog: Online Log Compaction for Dependable Systems. 689-700
Volume 36, Number 5, May 2025
- Omer F. Rana

, Josef Spillner
, Stephen Leak, Gerald F. Lofstead II, Rafael Tolosana-Calasanz
:
Guest Editorial: Special Section on SC22 Student Cluster Competition. 803 - Alexandros Nikolaos Ziogas

, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis
, Johannes de Fine Licht, Luca Lavarini, Torsten Hoefler:
Productivity, Portability, Performance, and Reproducibility: Data-Centric Python. 804-820 - Fu-Chiang Chang

, En-Ming Huang
, Pin-Yi Kuo, Chan-Yu Mou
, Hsu-Tzu Ting, Pang-Ning Wu, Jerry Chou
:
Reproducing Performance of Data-Centric Python by SCC Team From National Tsing Hua University. 821-825 - Zihan Yang

, Yi Chen
, Kaiqi Chen
, Xingjian Qian
, Shaojun Xu
, Yun Pan
, Chong Zeng
, Jianhai Chen
, Yin Zhang
, Zeke Wang
:
Critique of "Productivity, Portability, Performance: Data-Centric Python" by SCC Team From Zhejiang University. 826-829 - Han Huang

, Tengyang Zheng
, Tianxing Yang
, Yang Ye
, Siran Liu
, Zhe Tang
, Shengyou Lu
, Guangnan Feng
, Zhiguang Chen
, Dan Huang
:
Critique of "Productivity, Portability, Performance Data-Centric Python" by SCC Team From Sun Yat-sen University. 830-834 - Christopher Lompa

, Piotr Luczynski
:
Analysis and Reproducibility of "Productivity, Portability, Performance: Data-Centric Python". 835-840 - Anish Govind

, Yuchen Jing
, Stefanie Dao
, Michael Granado
, Rachel Handran
, Davit Margarian
, Matthew Mikhailov
, Danny Vo
, Matei-Alexandru Gardus
, Khai Vu
, Derek Bouius, Bryan Chin
, Mahidhar Tatineni
, Mary P. Thomas
:
Reproducibility of the DaCe Framework on NPBench Benchmarks. 841-846 - Yuan Gao

, Liquan Chen
, Jianchang Lai
, Tianyi Wang
, Xiaoming Wu, Shui Yu
:
IoT-Dedup: Device Relationship-Based IoT Data Deduplication Scheme. 847-860 - Conor John Williams

, James Elliott:
Libfork: Portable Continuation-Stealing With Stackless Coroutines. 877-888 - Keyun Cheng

, Huancheng Puyang, Xiaolu Li
, Patrick P. C. Lee
, Yuchong Hu
, Jie Li
, Ting-Yi Wu:
Toward Load-Balanced Redundancy Transitioning for Erasure-Coded Storage. 889-902 - Junhan Liu

, Zinuo Cai
, Yumou Liu, Hao Li
, Zongpu Zhang
, Ruhui Ma
, Rajkumar Buyya
:
SMore: Enhancing GPU Utilization in Deep Learning Clusters by Serverless-Based Co-Location Scheduling. 903-917 - Hyeonjin Kim

, Taesoo Lim
, William J. Song
:
Graphite: Hardware-Aware GNN Reshaping for Acceleration With GPU Tensor Cores. 918-931 - S. M. Shovan

, Arindam Khanda
, Sajal K. Das
:
Parallel Multi Objective Shortest Path Update Algorithm in Large Dynamic Networks. 932-944 - Xiangyu Zou

, Wen Xia
, Philip Shilane
, Haijun Zhang
, Xuan Wang
:
The Design of a High-Performance Fine-Grained Deduplication Framework for Backup Storage. 945-960 - Qiange Wang

, Xin Ai, Yongze Yan
, Shufeng Gong
, Yanfeng Zhang
, Jing Chen
, Ge Yu
:
Towards Communication-Efficient Out-of-Core Graph Processing on the GPU. 961-976 - Huijing Yang

, Juan Fang
, Yumin Hou
, Xing Su
, Neal N. Xiong
:
Reinforcement Learning-Driven Adaptive Prefetch Aggressiveness Control for Enhanced Performance in Parallel System Architectures. 977-993 - Zerui Shao

, Beibei Li
, Peiran Wang
, Yi Zhang
, Kim-Kwang Raymond Choo
:
FedLoRE: Communication-Efficient and Personalized Edge Intelligence Framework via Federated Low-Rank Estimation. 994-1010 - Jingweijia Tan

, Xurui Li, An Zhong
, Kaige Yan
, Xiaohui Wei
, Guanpeng Li
:
GEREM: Fast and Precise Error Resilience Assessment for GPU Microarchitectures. 1011-1024 - Jan Laukemann

, Ahmed E. Helal
, S. Isaac Geronimo Anderson, Fabio Checconi, Yongseok Soh, Jesmin Jahan Tithi
, Teresa M. Ranadive
, Brian J. Gravelle, Fabrizio Petrini, Jee W. Choi:
Accelerating Sparse Tensor Decomposition Using Adaptive Linearized Representation. 1025-1041 - Weihan Kong

, Shengan Zheng
, Yifan Hua
, Ruoyan Ma, Yuheng Wen
, Guifeng Wang
, Cong Zhou
, Linpeng Huang
:
PimBeam: Efficient Regular Path Queries Over Graph Database Using Processing-in-Memory. 1042-1057 - Zhaochen Zhang

, Xu Zhang
, Zhaoxiang Bao
, Liang Wei, Chaohong Tan, Wanchun Dou
, Guihai Chen
, Chen Tian
:
Courier: A Unified Communication Agent to Support Concurrent Flow Scheduling in Cluster Computing. 861-876
Volume 36, Number 6, June 2025
- Yuan Yao

, Yujiao Hu
, Yi Dang, Wei Tao, Kai Hu, Qiming Huang, Zhe Peng
, Gang Yang
, Xingshe Zhou:
Workload-Aware Performance Model Based Soft Preemptive Real-Time Scheduling for Neural Processing Units. 1058-1070 - Wei Gao

, Zhuoyuan Ouyang
, Peng Sun
, Tianwei Zhang
, Yonggang Wen
:
IceFrog: A Layer-Elastic Scheduling System for Deep Learning Training in GPU Clusters. 1071-1086 - Wenting Wei

, Huaxi Gu
, Zhe Xiao, Yi Chen:
Energy Efficient and Multi-Resource Optimization for Virtual Machine Placement by Improving MOEA/D. 1087-1099 - Wenming Li

, Zhihua Fan
, Tianyu Liu
, Zhen Wang
, Haibin Wu
, Meng Wu
, Kunming Zhang, Yanhuan Liu
, Ninghui Sun
, Xiaochun Ye
, Dongrui Fan
:
DFU-E: A Dataflow Architecture for Edge DSP and AI Applications. 1100-1114 - Yifeng Tang

, Huaman Zhou
, Zhuoran Ji
, Cho-Li Wang:
Cube-fx: Mapping Taylor Expansion Onto Matrix Multiplier-Accumulators of Huawei Ascend AI Processors. 1115-1129 - William Andrew Simon

, Irem Boybat
, Riselda Kodra
, Elena Ferro
, Gagandeep Singh
, Mohammed Alser
, Shubham Jain
, Hsinyu Tsai
, Geoffrey W. Burr
, Onur Mutlu
, Abu Sebastian
:
CiMBA: Accelerating Genome Sequencing Through On-Device Basecalling via Compute-in-Memory. 1130-1145 - Wei Zhang

, Yunlong Yu
, Xiao Jiang, Nan Guan
, Naijun Zhan
, Lei Ju
:
WCET Estimation for CNN Inference on FPGA SoC With Multi-DPU Engines. 1146-1160 - Haobin Tan

, Yao Xiao
, Amelie Chi Zhou
, Kezhong Lu
, Xuan Yang
:
Distributed and Adaptive Partitioning for Large Graphs in Geo-Distributed Data Centers. 1161-1174 - Kumseok Jung

, Julien Gascon-Samson
, Sathish Gopalakrishnan
, Karthik Pattabiraman
:
OneOS: Distributed Operating System for the Edge-to-Cloud Continuum. 1175-1192 - Luca Colagrande

, Luca Benini
:
Taming Offload Overheads in a Massively Parallel Open-Source RISC-V MPSoC: Analysis and Optimization. 1193-1205 - Somesh Singh

, Bora Uçar
:
Efficient Parallel Sparse Tensor Contraction. 1206-1219 - Guohua Xin

, Guangquan Xu
, Yao Zhang
, Cheng Wen, Cen Zhang
, Xiaofei Xie
, Neal N. Xiong
, Shaoying Liu
, Pan Gao:
IRHunter: Universal Detection of Instruction Reordering Vulnerabilities for Enhanced Concurrency in Distributed and Parallel Systems. 1220-1236 - Thomas W. Pusztai

, Stefan Nastic
:
ChunkFunc: Dynamic SLO-Aware Configuration of Serverless Functions. 1237-1252 - Sahil Tyagi

, Prateek Sharma
:
OmniLearn: A Framework for Distributed Deep Learning Over Heterogeneous Clusters. 1253-1267 - Zhengyu Liao

, Shiyou Qian
, Zhonglong Zheng
, Jian Cao
, Guangtao Xue
, Minglu Li
:
AWB+-Tree: A Novel Width-Based Index Structure Supporting Hybrid Matching for Large-Scale Content-Based Pub/Sub Systems. 1268-1281 - Huazhong Lü

, Kai Deng, Xiaomei Yang
:
Symmetric Properties and Two Variants of Shuffle-Cubes. 1282-1293 - Xiangyu Kong

, Yi Huang
, Longlong Chen
, Jianfeng Zhu
, Liangwei Li
, Xingchen Man
, Mingyu Gao
, Shaojun Wei
, Leibo Liu
:
Raccoon: Lightweight Support for Comprehensive Control Flows in Reconfigurable Spatial Architectures. 1294-1310 - Laleh Ghalami, Daniel Grosu

:
Parallel Greedy Algorithms for Steiner Forest. 1311-1325 - Yuxia Cheng

, Linfeng Xu, Tongkai Yang, Wei Wu, Zhiqiang Lin, Antong Yu, Wenzhi Chen
:
Beehive: Decentralised High-Frequency Small Tasks Scheduling in Large Clusters. 1326-1337 - Xue Jiang

, Hengfeng Wei
, Yu Huang
, Yuxing Chen
, Anqun Pan
:
A Generic Specification Framework for Weakly Consistent Replicated Data Types. 1338-1353
Volume 36, Number 7, July 2025
- Hui Dou

, Mingjie He
, Lei Zhang
, Yiwen Zhang
, Zibin Zheng
:
CausalConf: Datasize-Aware Configuration Auto-Tuning for Recurring Big Data Processing Jobs via Adaptive Causal Structure Learning. 1354-1371 - Zhaorui Zhang

, Sheng Di
, Kai Zhao
, Sian Jin
, Dingwen Tao
, Zhuoran Ji
, Benben Liu
, Khalid Ayedh Alharthi
, Jiannong Cao
, Franck Cappello:
FedCSpc: A Cross-Silo Federated Learning System With Error-Bounded Lossy Parameter Compression. 1372-1386 - Zhibo Xuan

, Xin Sun
, Xin You
, Hailong Yang
, Zhongzhi Luan
, Yi Liu
, Depei Qian
:
Identifying Performance Inefficiencies of Parallel Program With Spatial and Temporal Trace Analysis. 1387-1400 - Yuan Meng

, Mahesh A. Iyer, Viktor K. Prasanna
:
An Acceleration Framework for Deep Reinforcement Learning Using Heterogeneous Systems. 1401-1415 - Qiang Wang

, Zhicheng Li, Fucai Zhou
, Jian Xu
, Changsheng Zhang
:
Publicly Verifiable Distributed Computation for MEC Setting. 1416-1430 - Yangjun Wu, Wanlu Cao, Jiacheng Zhao, Honghui Shang

:
Fast and Scalable Neural Network Quantum States Method for Molecular Potential Energy Surfaces. 1431-1443 - Jiajian Zhang

, Fangyu Wu
, Hai Jiang
, Qiufeng Wang
, Genlang Chen
, Guangliang Cheng
, Eng Gee Lim
, Keqin Li
:
AlignMalloc: Warp-Aware Memory Rearrangement Aligned With UVM Prefetching for Large-Scale GPU Dynamic Allocations. 1444-1459 - Na Wang

, Kaifa Zheng
, Wen Zhou
, Jianwei Liu
, Lunzhi Deng
, Junsong Fu
:
A Lightweight and Fine-Grained Ciphertext Search Scheme for Big Data Assisted by Proxy Servers. 1460-1477 - Wanqi Yang

, Pengfei Chen
, Kai Liu, Huxing Zhang
:
ZeroTracer: In-Band eBPF-Based Trace Generator With Zero Instrumentation for Microservice Systems. 1478-1494 - Qingcai Jiang

, Zhenwei Cao, Junshi Chen
, Xinming Qin, Wei Hu
, Hong An
, Jinlong Yang
:
PWDFT-SW: Extending the Limit of Plane-Wave DFT Calculations to 16K Atoms on the New Sunway Supercomputer. 1495-1508 - Kechang Yang

, Biao Hu
, Mingguo Zhao
:
Coordinating Computational Capacity for Adaptive Federated Learning in Heterogeneous Edge Computing Systems. 1509-1523 - Fangyu Zheng

, Guang Fan
, Wenxu Tang
, Yixuan Song
, Tian Zhou
, Yuan Zhao
, Jiankuo Dong
, Jingqiang Lin
, Shoumeng Yan
, Jiwu Jing
:
GIF-FHE: A Comprehensive Implementation and Evaluation of GPU-Accelerated FHE With Integer and Floating-Point Computing Power. 1524-1541 - Pengmiao Zhang

, Rajgopal Kannan
, Viktor K. Prasanna
:
GraFetch: Accelerating Graph Applications Through Domain Specific Hierarchical Hybrid Prefetching. 1542-1559 - He Zhu

, Mingyu Li, Haihang You
:
RHINO: An Efficient Serverless Container System for Small-Scale HPC Applications. 1560-1573 - Wenhao Lu

, Zhiyuan Wang
, Hefan Zhang
, Shan Zhang
, Hongbin Luo
:
OpenSN: An Open Source Library for Emulating LEO Satellite Networks. 1574-1590 - Zijie Liu

, Yi Cheng
, Can Chen, Jun Hu, Rongguo Fu
, Dengyin Zhang
:
ISACPP: Interference-Aware Scheduling Approach for Deep Learning Training Workloads Based on Co-Location Performance Prediction. 1591-1607 - Mimi Qian

, Lin Cui
, Xiaoquan Zhang
, Fung Po Tso
, Yuhui Deng
, Zhetao Li
, Weijia Jia
:
DisPLOY: Target-Constrained Distributed Deployment for Network Measurement Tasks on Data Plane. 1608-1619 - Stefan Popa

, Vlad Petric
, Mihai Ivanovici
:
A Highly-Parallel and Scalable Hardware Accelerator for the NTest Othello Game Engine. 1620-1633 - Xuyang Liu

, Zijian Zhang
, Zhen Li, Hao Yin
, Meng Li
, Jiamou Liu
, Mauro Conti
, Liehuang Zhu
:
ABSE: Adaptive Baseline Score-Based Election for Leader-Based BFT Systems. 1634-1650 - Xingguo Pang

, Liu Liu
, Yanze Zhang, Zhuofu Chen
, Zhijun Ding
, Dazhao Cheng
, Xiaobo Zhou
:
Featherlight Stateful WebAssembly for Serverless Inference Workflows. 1651-1665 - Guangyao Zhou

, Yiqin Fu, Haocheng Lan, Yuanlun Xie, Wenhong Tian
, Rajkumar Buyya
, Jianhong Qian, Teng Su
:
Cross-Search With Improved Multi-Dimensional Dichotomy-Based Joint Optimization for Distributed Parallel Training of DNN. 1680-1694 - Hao Wu

, Shiyi Wang
, Youhui Bai
, Cheng Li
, Quan Zhou
, Jun Yi
, Feng Yan
, Ruichuan Chen
, Yinlong Xu
:
A Generic, High-Performance, Compression-Aware Framework for Data Parallel DNN Training. 1695-1712
Volume 36, Number 8, August 2025
- Mohamed Yassine Boukhari

, Akash Balasaheb Dhasade, Anne-Marie Kermarrec
, Rafael Pires
, Othmane Safsafi, Rishi Sharma
:
Boosting Resource-Constrained Federated Learning Systems With Guessed Updates. 1666-1679 - Kohei Yoshida

, Ryuichi Sakamoto
, Kento Sato
, Abhinav Bhatele
, Hayato Yamaki
, Hiroki Honda
, Shinobu Miwa
:
VAHRM: Variation-Aware Resource Management in Heterogeneous Supercomputing Systems. 1713-1727 - Yuhui Zhang

, Hong Liao, Lutan Zhao
, Yuncong Shao, Zhihong Tian
, XiaoFeng Wang, Dan Meng, Rui Hou
:
An Efficient Speculative Federated Tree Learning System With a Lightweight NN-Based Predictor. 1728-1743 - Rui Zhao

, Kui Wang
, Yun Li
, Yuze Fan
, Fei Gao
, Zhenhai Gao
:
Safe Multi-Agent Deep Reinforcement Learning for the Management of Autonomous Connected Vehicles at Future Intersections. 1744-1761 - Lin Qiu

, Xing-Wei Wang
, Bo Yi, Kaimin Zhang, Fei Gao
, Min Huang
, Yanpeng Qu
:
Towards Efficiency and Decentralization: A Blockchain Assisted Distributed Fuzzy-Rough Feature Selection. 1762-1778 - Yuanming Zhang

, Pinghui Wang
, Kuankuan Cheng, Junzhou Zhao
, Jing Tao, Jingxin Hai, Junlan Feng, Chao Deng, Xidian Wang:
Building Accurate and Interpretable Online Classifiers on Edge Devices. 1779-1796 - Yaning Yang

, Xiaoqi Wang, Chengqing Li
, Shaoliang Peng
:
Parallel Acceleration of Genome Variation Detection on Multi-Zone Heterogeneous System. 1797-1809 - Weiyi Sun

, Jianfeng Zhu
, Mingyu Gao
, Zhaoshi Li, Shaojun Wei
, Leibo Liu
:
SSS-DIMM: Removing Redundant Data Movement in Trusted DIMM-Based Near-Memory-Processing Kernel Offloading via Secure Space Sharing. 1810-1827
Volume 36, Number 9, September 2025
- Shengwei Li

, Zhiquan Lai
, Dongsheng Li
, Yanqi Hao, Weijie Liu
, Keshi Ge
, Xiaoge Deng
, Kai Lu
:
Oases: Efficient Large-Scale Model Training on Commodity Servers via Overlapped and Automated Tensor Model Parallelism. 1828-1840 - Zhaoyang Xie

, Haibin Zhang
, Sisi Duan
, Chao Liu
, Shengli Liu
, Xuanji Meng, Yong Yu
, Fangguo Zhang
, Boxin Zhao, Liehuang Zhu
, Tianqing Zhu
:
Everything Distributed and Asynchronous: A Practical System for Key Management Service. 1841-1856 - Adrian C. Rublein

, Fidan Mehmeti
, Mark Mahon
, Thomas F. La Porta
:
Improved Methods of Task Assignment and Resource Allocation With Preemption in Edge Computing Systems. 1857-1871 - Fanxin Li

, Shixiong Zhao
, Yuhao Qing
, Jianyu Jiang
, Xusheng Chen
, Heming Cui
:
PipeMesh: Achieving Memory-Efficient Computation-Communication Overlap for Training Large Language Models. 1872-1889 - Long Yuan

, Zeyu Zhou, Zi Chen
, Xuemin Lin
, Xiang Zhao
, Fan Zhang
:
GPUSCAN++: Efficient Structural Graph Clustering on GPUs. 1890-1903 - Yang Bai

, Mingjun Li
, Wendong Xu
, Bei Yu
:
A Learned Performance Model With Transfer Learning Across GPUs on Tensorized Instructions. 1904-1919 - Hao Zhou

, Yuanhui Chen, Wu Zeng, Lixiao Cui
, Gang Wang
, Xiaoguang Liu
:
GPComp: Using GPU and SSD-GPU Peer to Peer DMA to Accelerate LSM-Tree Compaction for Key-Value Store. 1920-1936 - Laura Carnevali

, Marco Paolieri
, Riccardo Reali
, Leonardo Scommegna
, Enrico Vicario
:
Compositional Coordinated Resource Provisioning in Workflows With Stochastic Durations. 1937-1954 - Xiaoyong Yan

, Fu Xiao
, Jian Zhou
, Xiulong Liu
, Chuntao Ding
, Jiannong Cao
, Aiguo Song
, Alex X. Liu
:
NDP: Network Division Positioning for Irregular Multi-Hop Networks. 1955-1971 - Ruidong Zhu

, Ziyue Jiang
, Zhi Zhang, Xin Liu, Xuanzhe Liu
, Xin Jin
:
Cannikin: No Lagger of SLO in Concurrent Multiple LoRA LLM Serving. 1972-1984 - Shengle Lin

, Guoqing Xiao
, Haotian Wang
, Wangdong Yang
, Kenli Li
, Keqin Li
:
High Performance OpenCL-Based GEMM Kernel Auto-Tuned by Bayesian Optimization. 1985-1997 - Yinuo Wang

, Zeyu Song, Wubing Wan
, Xinpeng Zhao
, Lin Gan
, Ping Gao
, Wenqiang Wang, Zhenguo Zhang
, Haohuan Fu
, Wei Xue
, Guangwen Yang
:
Accelerating Half-Precision Seismic Simulation on Neural Processing Unit. 1998-2013 - Kaiyuan Liu

, Xiaobo Zhou
, Li Li
:
m²LLM: A Multi-Dimensional Optimization Framework for LLM Inference on Mobile Devices. 2014-2029 - Oleksandr O. Sudakov

, Volodymyr L. Maistrenko
:
Parallelization of Network Dynamics Computations in Heterogeneous Distributed Environment. 2030-2044 - Giulio Malenza

, Valentina Cesare
, Marco Edoardo Santimaria
, Robert Birke
, Alberto Vecchiato
, Ugo Becciani
, Marco Aldinucci
:
Performance Portability Assessment in Gaia. 2045-2057
Volume 36, Number 10, October 2025
- Weimin Li

, Qin Li
, Weihong Tian
, Jie Gao
, Fan Wu
, Jianxun Liu
, Ju Ren
:
MUCVR: Edge Computing-Enabled High-Quality Multi-User Collaboration for Interactive MVR. 2058-2072 - Yujian Wu

, Shanjiang Tang
, Ce Yu
, Bin Yang
, Chao Sun
, Jian Xiao
, Hutong Wu, Jinghua Feng:
Task Scheduling in Geo-Distributed Computing: A Survey. 2073-2088 - Bin Deng

, Weidong Li
:
Dynamic Multiresource Fair Allocation With Time Discount Utility. 2089-2103 - Yepeng Zhang

, Haitao Zhang
, Huadong Ma
:
RL-Based Hybrid CPU Scaling for Soft Deadline Constrained Tasks in Container Clouds. 2104-2118 - Yishan Chen

, Xiangwei Zeng
, Huashuai Cai, Qing Xu, Zhiquan Liu
:
Decentralized QoS-Aware Model Inference Using Federated Split Learning for Cloud-Edge Medical Detection. 2119-2136 - Xing Wei

, Ke Wang
, Yinjun Han, Hao Jin, Yaofeng Tu
, Huiqi Hu
, Xuan Zhou, Minghao Zhao:
Mariana: Exploring Native SkipList Index Design for Disaggregated Memory. 2137-2151 - Mostafa Kishani

, Sina Ahmadi
, Saba Ahmadian
, Reza Salkhordeh
, Zdenek Becvar
, Onur Mutlu
, André Brinkmann
, Hossein Asadi
:
ELICA: Efficient and Load Balanced I/O Cache Architecture for Hyperconverged Infrastructures. 2152-2168
Volume 36, Number 11, November 2025
- Menghao Guo, Longlong Chen, Yichi Zhang, Hongyi Guan, Shaojun Wei, Jianfeng Zhu, Leibo Liu:

Exploiting Fine-Grained Task-Level Parallelism for Variant Calling Acceleration. 2169-2181 - Tianchan Guan, Yijin Guan, Zhaoyang Du, Jiacheng Ma, Boyu Tian, Zhao Wang, Teng Ma, Zheng Liu, Yang Kong, Yuan Xie, Mingyu Gao, Guangyu Sun, Hongzhong Zheng, Dimin Niu:

MemTunnel: A CXL-Based Rack-Scale Host Memory Pooling Architecture for Cloud Service. 2182-2197 - Wang Zhang, Yuyang Zhu, Zhan Shi, Manyu Dang, Yutong Wu, Fang Wang, Dan Feng:

A Sparse Function Prediction Approach for Cold Start Optimization and User Satisfaction Guarantee in Serverless. 2198-2213 - Yufei Yang, Chenhao Xie, Liansheng Liu, Xiyuan Peng, Yu Peng, Hailong Yang, Depei Qian:

PreTrans: Enabling Efficient CGRA Multi-Task Context Switch Through Config Pre-Mapping and Data Transceiving. 2214-2228 - Hancheng Wang, Haipeng Dai, Shusen Chen, Meng Li, Rong Gu, Youyou Lu, Chengxun Wu, Jiaqi Zheng, Lexi Xu, Guihai Chen:

Parallel Wormhole Filters: High-Performance Approximate Membership Query Data Structures for Persistent Memory. 2229-2246 - Xishuo Li, Shan Zhang, Tie Ma, Zhiyuan Wang, Hongbin Luo:

Doing More With Less: Balancing Probing Costs and Task Offloading Efficiency At the Network Edge. 2247-2263 - Zhiwei Wang, Haoqi He, Lutan Zhao, Peinan Li, Zhihao Li, Dan Meng, Rui Hou:

Chameleon: An Efficient FHE Scheme Switching Acceleration on GPUs. 2264-2280 - Bohuai Xiao, Chujia Yu, Xing Chen, Zheyi Chen, Geyong Min:

Multi-Agent Collaboration for Workflow Task Offloading in End-Edge-Cloud Environments Using Deep Reinforcement Learning. 2281-2296 - Huijun Wang, Oliver Sinnen:

Scheduling Fork-Joins With Communication Delays and Equal Processing Times on Heterogeneous Processors. 2297-2309 - Xingzi Yu, Xingguo Jia, Jin Zhang, Yun Wang, Senhao Yu, Zhengwei Qi:

Rethinking Virtual Machines Live Migration for Memory Disaggregation. 2310-2324 - Ouwen Jin, Qinghui Xing, Zhuo Chen, Ming Zhang, De Ma, Ying Li, Xin Du, Shuibing He, Shuiguang Deng, Gang Pan:

Mapping Large-Scale Spiking Neural Network on Arbitrary Meshed Neuromorphic Hardware. 2325-2340 - Yiqin Dai, Ruibo Wang, Yong Dong, Min Xie, Juan Chen, Wenzhe Zhang, Huijun Wu, Mingtian Shao, Kai Lu:

MIST: Towards MPI Instant Startup and Termination on Tianhe HPC Systems. 2341-2353 - Jie Gao, Jia Hu, Geyong Min, Fei Hao:

XDGNN: Efficient Distributed GNN Training via Explanation-Guided Subgraph Expansion. 2354-2365 - Zhengyi Yuan, Xiong Wang, Yuntao Nie, Yufei Tao, Yuqing Li, Zhiyuan Shao, Xiaofei Liao, Bo Li, Hai Jin:

DynPipe: Toward Dynamic End-to-End Pipeline Parallelism for Interference-Aware DNN Training. 2366-2382 - Mengsi He, Zhongming Fu, Zhuo Tang:

Optimizing Data Locality by Integrating Intermediate Data Partitioning and Reduce Task Scheduling in Spark Framework. 2383-2398 - Mathias Oliveira, Willian Barreiros, Renato Ferreira, Alba C. M. A. Melo, George Teodoro:

The Megapixel Approach for Efficient Execution of Irregular Wavefront Algorithms on GPUs. 2399-2411 - Babar Ali, Muhammed Golec, Sukhpal Singh Gill, Félix Cuadrado, Steve Uhlig:

EdgeAIBus: AI-Driven Joint Container Management and Model Selection Framework for Heterogeneous Edge Computing. 2412-2424 - Matthew Weidner, Martin Kleppmann:

The Art of the Fugue: Minimizing Interleaving in Collaborative Text Editing. 2425-2437 - Kyrian Adimora, Hongyang Sun:

HARMONIC: Uncertainty-Aware Multi-Objective Optimization for Energy-Efficient HPC Resource Management. 2438-2450















