


ICPP 2023: Salt Lake City, UT, USA
- Proceedings of the 52nd International Conference on Parallel Processing, ICPP 2023, Salt Lake City, UT, USA, August 7-10, 2023. ACM 2023
Numerics (In-Person)
- Sameer Deshmukh, Rio Yokota, George Bosilca, Qianxiang Ma: O(N) distributed direct factorization of structured dense matrices using runtime systems. 1-10
Optimization of AI/ML (In Person)
- Georgia Channing, Ria Patel, Paula Olaya, Ariel Keller Rorabaugh, Osamu Miyashita, Silvina Caíno-Lores, Catherine D. Schuman, Florence Tama, Michela Taufer: Composable Workflow for Accelerating Neural Architecture Search Using In Situ Analytics for Protein Classification. 1
Numerics (In-Person)
- M. Ridwan Apriansyah, Rio Yokota: Computing the k-th Eigenvalue of Symmetric H2-Matrices. 11-20
- Junqing Lin, Honghe Zhang, Xiaolong Shi, Jingwei Sun, Xianzhi Yu, Jun Yao, Guangzhong Sun: EC-SpMM: Efficient Compilation of SpMM Kernel on GPUs. 21-30
Compression and Encoding (In Person)
- Fangzheng Lin, Kasidis Arunruangsirilert, Heming Sun, Jiro Katto: Recoil: Parallel rANS Decoding with Decoder-Adaptive Scalability. 31-40
- Mi Zhang, Qihan Kang, Patrick P. C. Lee: Minimizing Network and Storage Costs for Consensus with Flexible Erasure Coding. 41-50
- Shui Jiang, Tsung-Wei Huang, Bei Yu, Tsung-Yi Ho: SNICIT: Accelerating Sparse Neural Network Inference via Compression at Inference Time on GPU. 51-61
AI/ML Performance (Remote Session)
- Lixiao Cui, Kedi Yang, Yusen Li, Gang Wang, Xiaoguang Liu: DiffLex: A High-Performance, Memory-Efficient and NUMA-Aware Learned Index using Differentiated Management. 62-71
- Hesheng Sun, Xinyi Chen, Zhuzhong Qian, Zengji Li, Ning Chen, Tuo Cao, Suwei Xu, Yitong Zhou: BIRP: Batch-aware Inference Workload Redistribution and Parallel Scheme for Edge Collaboration. 72-81
- Yongwen Qiu, Yongmei Lei, Guozheng Wang: PSRA-HGADMM: A Communication Efficient Distributed ADMM Algorithm. 82-91
- Zhenxing Li, Qiang Cao, Yajie Chen, Wenrui Yan: CoTrain: Efficient Scheduling for Large-Model Training upon GPU and CPU in Parallel. 92-101
- Zixuan Chen, Lei Shi, Xuandong Liu, Jiahui Li, Sen Liu, Yang Xu: OSP: Boosting Distributed Model Training with 2-stage Synchronization. 102-111
- Yuning Zhang, Zao Zhang, Wei Bao, Dong Yuan: ITIF: Integrated Transformers Inference Framework for Multiple Tenants on GPU. 112-121
Graph Algorithms (In Person)
- Bin Guo, Emil Sekerinski: Parallel Order-Based Core Maintenance in Dynamic Graphs. 122-131
- Md Abdul Motaleb Faysal, Maximilian H. Bremer, Cy P. Chan, John Shalf, Shaikh Arifuzzaman: Fast Parallel Index Construction for Efficient K-truss-based Local Community Detection in Large Graphs. 132-141
- Samiran Kawtikwar, Mohammad Almasri, Wen-Mei Hwu, Rakesh Nagi, Jinjun Xiong: BEEP: Balanced Efficient subgraph Enumeration in Parallel. 142-152
Programming Models (In Person)
- Omri Mor, George Bosilca, Marc Snir: Improving the Scaling of an Asynchronous Many-Task Runtime with a Lightweight Communication Engine. 153-162
- Romain Pereira, Adrien Roussel, Patrick Carribault, Thierry Gautier: Investigating Dependency Graph Discovery Impact on Task-based MPI+OpenMP Applications Performances. 163-172
- Eric Wright, Johannes Doerfert, Shilei Tian, Barbara M. Chapman, Sunita Chandrasekaran: Implementing OpenMP's SIMD Directive in LLVM's GPU Runtime. 173-182
Applications (Remote Session)
- Peng Wang, Yu Liu, Zhelong Zhao, Ke Zhou, Zhihai Huang, Yanxiong Chen: Smart Cache Insertion and Promotion Policy for Content Delivery Networks. 183-192
- Haowen Zhang, Jing Li, He Zhao, Tong Zhou, Nianzu Sheng, Hengyu Pan: BlockPilot: A Proposer-Validator Parallel Execution Framework for Blockchain. 193-202
- Chenyang Jiao, Weihua Zhang, Li Shen: Communication Optimizations for State-vector Quantum Simulator on CPU+GPU Clusters. 203-212
LSM-Tree Research (Remote Session)
- Zepeng Wang, Shu Yin: RBC: A bandwidth controller to reduce write-stalls and tail latency. 213-222
- Ziyi Lu, Qiang Cao, Shucheng Wang, Jie Yao, Xiangrui Yang: PMLDS: An LSM-Tree Direct Managed Storage for Key-Value Stores on Byte-Addressable Devices. 223-232
- Chen Ding, Jian Zhou, Jiguang Wan, Yiqin Xiong, Sicen Li, Shuning Chen, Hanyang Liu, Liu Tang, Ling Zhan, Kai Lu, Peng Xu: DComp: Efficient Offload of LSM-tree Compaction with Data Processing Units. 233-243
Applications (Remote Session, Part II)
- Jiali Li, Xianzhang Chen, Duo Liu, Ao Ren, Zhaoyang Zeng, Yujuan Tan: RadarSSD: A Computational Storage for Radar Signal Processing. 244-253
Training (In Person)
- Sixu Hu, Qinbin Li, Bingsheng He: Communication-Efficient Generalized Neuron Matching for Federated Learning. 254-263
- Jiyao Liu, Xinliang Wei, Xuanzhang Liu, Hongchang Gao, Yu Wang: Group-based Hierarchical Federated Learning: Convergence, Group Formation, and Sampling. 264-273
- Feiwen Zhu, Michal Futrega, Han Bao, Sukru Burc Eryilmaz, Fei Kong, Kefeng Duan, Xinnian Zheng, Nimrod Angel, Matthias Jouanneaux, Maxmilian Stadler, Michal Marcinkiewicz, Fung Xie, June Yang, Michael Andersch: FastDimeNet++: Training DimeNet++ in 22 minutes. 274-284
Communication (In Person)
- Thomas Gillis, Ken Raffenetti, Hui Zhou, Yanfei Guo, Rajeev Thakur: Quantifying the Performance Benefits of Partitioned Communication in MPI. 285-294
- George Katevenis, Manolis Ploumidis, Manolis Marazakis: Impact of Cache Coherence on the Performance of Shared-Memory based MPI Primitives: A Case Study for Broadcast on Intel Xeon Scalable Processors. 295-305
- Whit Schonbein, Scott Levy, Matthew G. F. Dosanjh, W. Pepper Marts, Elizabeth Reid, Ryan E. Grant: Modeling and Benchmarking the Potential Benefit of Early-Bird Transmission in Fine-Grained Communication. 306-316
System Software (Remote Session)
- Tiannuo Yang, Ruobing Chen, Yusen Li, Xiaoguang Liu, Gang Wang: CoTuner: A Hierarchical Learning Framework for Coordinately Optimizing Resource Partitioning and Parameter Tuning. 317-326
- Jingrun Zhang, Guangba Yu, Zilong He, Liang Ai, Pengfei Chen: DeepPower: Deep Reinforcement Learning based Power Management for Latency Critical Applications in Multi-core Systems. 327-336
- Yi Bian, Fangyu Zheng, Yuewu Wang, Lingguang Lei, Yuan Ma, Jiankuo Dong, Jiwu Jing: AsyncGBP: Unleashing the Potential of Heterogeneous Computing for SSL/TLS with GPU-based Provider. 337-346
- Benran Wang, Hongyang Chen, Pengfei Chen, Zilong He, Guangba Yu: MARS: Fault Localization in Programmable Networking Systems with Low-cost In-Band Network Telemetry. 347-357
- Xianzhi Zhu, Yongkun Li, Lulu Yao, Zhihao Qi, Yinlong Xu, Pengcheng Wang, Weiguang Wang, Xia Zhu: On Optimizing Traffic Scheduling for Multi-replica Containerized Microservices. 358-368
- Xinxin Qi, Juan Chen, Yong Dong, Yuan Yuan, Tao Xu, Rongyu Deng, Zekai Li, Kexing Zhou, Zheng Wang: HighRPM: Combining Integrated Measurement and Sofware Power Modeling for High-Resolution Power Monitoring. 369-379
Applications (In Person)
- Suneth Dasantha Ekanayake, István Zoltan Reguly, Fabio Luporini, Gihan Ravideva Mudalige: Communication-Avoiding Optimizations for Large-Scale Unstructured-Mesh Applications with OP2. 380-391
- Abbas Haghi, Lluc Alvarez, Jordi Front, Juan Miguel De Haro Ruiz, Roger Figueras, Max Doblas, Santiago Marco-Sola, Miquel Moretó: WFAsic: A High-Performance ASIC Accelerator for DNA Sequence Alignment on a RISC-V SoC. 392-401
- Jiechao Gao, Wenpeng Wang, Fateme Nikseresht, Viswajith Govinda Rajan, Bradford Campbell: PFDRL: Personalized Federated Deep Reinforcement Learning for Residential Energy Management. 402-411
Resource Scheduling and Adaptation (In Person)
- Hengwei Xu, Pengyuan Zhou, Haiyong Xie, Yong Liao: Mercury: Fast and Optimal Device Placement for Large Deep Learning Models. 412-422
- Suraiya Tairin, Haiying Shen, Zeyu Zhang: Embracing Uncertainty for Equity in Resource Allocation in ML Training. 423-432
- Ghazanfar Ali, Mert Side, Sridutt Bhalachandra, Nicholas J. Wright, Yong Chen: Performance-Aware Energy-Efficient GPU Frequency Selection using DNN-based Models. 433-442
Federated Learning (Remote Session)
- Jieling Yu, Ruiting Zhou, Chen Chen, Bo Li, Fang Dong: ASFL: Adaptive Semi-asynchronous Federated Learning for Balancing Model Accuracy and Total Latency in Mobile Edge Networks. 443-451
- Mengyao Du, Miao Zhang, Lin Liu, Kai Xu, Quanjun Yin: Credit-based Differential Privacy Stochastic Model Aggregation Algorithm for Robust Federated Learning via Blockchain. 452-461
- Songli Zhang, Zhenzhe Zheng, Fan Wu, Bingshuai Li, Yunfeng Shao, Guihai Chen: Learning From Your Neighbours: Mobility-Driven Device-Edge-Cloud Federated Learning. 462-471
- Qingyuan Wang, Bin Gao, Zhi Zhou, Fei Xu, Chenghao Ouyang: DAG-Aware Optimization for Geo-Distributed Data Analytics. 472-481
- YuAng Chen, Yeh-Ching Chung: Connectivity-Aware Link Analysis for Skewed Graphs. 482-491
- Haishuang Fan, Ming Li, Jingya Wu, Wenyan Lu, Xiaowei Li, Guihai Yan: BitColor: Accelerating Large-Scale Graph Coloring on FPGA with Parallel Bit-Wise Engines. 492-502
Graph-Related Techniques (In Person)
- Andrey Prokopenko, Damien Lebrun-Grandié, Daniel Arndt: Fast tree-based algorithms for DBSCAN for low-dimensional data on GPUs. 503-512
- Qinglin Lu, Xinyu Wang, Wenjing Ma, Yuwen Zhao, Daokun Chen, Fangfang Liu: GFFT: a Task Graph Based Fast Fourier Transform Optimization Framework. 513-523
- Octavi Obiols-Sales, Abhinav Vishnu, Nicholas Malaya, Aparna Chandramowlishwaran: ADARNet: Deep Learning Predicts Adaptive Mesh Refinement. 524-534
Memory and Storage (In Person)
- Louis-Claude Canon, Anthony Dugois, Loris Marchal, Etienne Rivière: Hector: A Framework to Design and Evaluate Scheduling Strategies in Persistent Key-Value Stores. 535-545
- Jong-Hyun Jeong, Myung Kuk Yoon, Yunho Oh, Gunjae Koo: Warped-MC: An Efficient Memory Controller Scheme for Massively Parallel Processors. 546-555
Networks (Remote Session)
- Fei Dai, Yawen Chen, Zhiyi Huang, Haibo Zhang: Wrht: Efficient All-reduce for Distributed DNN Training in Optical Interconnect Systems. 556-565
- Hao Zhang, Yawen Chen, Zhiyi Huang, Haibo Zhang, Fei Dai: SEECHIP: A Scalable and Energy-Efficient Chiplet-based GPU Architecture Using Photonic Links. 566-575
- Jinbin Hu, Yi He, Jin Wang, Wangqing Luo, Jiawei Huang: RLB: Reordering-Robust Load Balancing in Lossless Datacenter Networks. 576-584
Scheduling (Remote Session)
- Hehuan Shi, Lin Chen, Ming Lin, Rapharl Phan: Scheduling Dependent Batching Tasks. 585-594
- Yicheng Feng, Shihao Shen, Mengwei Xu, Yuanming Ren, Xiaofei Wang, Victor C. M. Leung, Wenyu Wang: Tango: Harmonious Management and Scheduling for Mixed Services Co-located among Distributed Edge-Clouds. 595-604
- Diaohan Luo, Tian Yu, Yuewen Wu, Heng Wu, Tao Wang, Wenbo Zhang: SPLIT: QoS-Aware DNN Inference on Shared GPU via Evenly-Sized Model Splitting. 605-614
- Huadong Li, Hui Liu, Changyuan Liu, Aoqi Chen, Zhaocheng Niu, Junzhao Du: NeiLatS: Neighbor-Aware Latency-Sensitive Application Scheduling in Heterogeneous Cloud-Edge Environment. 615-624
Inference (In Person)
- Xueyu Hou, Yongjie Guan, Tao Han: Dystri: A Dynamic Inference based Distributed DNN Service Framework on Edge. 625-634
- Jianfeng Gu, Yichao Zhu, Puxuan Wang, Mohak Chadha, Michael Gerndt: FaST-GShare: Enabling Efficient Spatio-Temporal GPU Sharing in Serverless Computing for Deep Learning Inference. 635-644
- Beilei Jiang, Xianwei Cheng, Yuan Li, Jocelyn Zhang, Song Fu, Qing Yang, Mingxiong Liu, Alejandro Olvera: Output-Directed Dynamic Quantization for DNN Acceleration. 645-654
Compilation and Checkpointing Techniques (In Person)
- Jan Hückelheim, Johannes Doerfert: ORAQL - Optimistic Responses to Alias Queries in LLVM. 655-664
- Nigel Tan, Jakob Lüttgau, Jack Marquez, Keita Teranishi, Nicolas Morales, Sanjukta Bhowmick, Franck Cappello, Michela Taufer, Bogdan Nicolae: Scalable Incremental Checkpointing using GPU-Accelerated De-Duplication. 665-674
- Masaki Nakata, Shigeyuki Sato, Tomoharu Ugawa: General-purpose Asynchronous Periodic Checkpointing in Hybrid Memory. 675-684
Memory and Storage (Remote Session)
- Zhenlin Qi, Shengan Zheng, Yifeng Hui, Bowen Zhang, Linpeng Huang: Conflux: Exploiting Persistent Memory and RDMA Bandwidth via Adaptive I/O Mode Selection. 685-694
- Hang An, Fang Wang, Dan Feng, Xiaomin Zou, Zefeng Liu, Jianshun Zhang: Marlin: A Concurrent and Write-Optimized B+-tree Index on Disaggregated Memory. 695-704
- Weiming Huang, Yajuan Du, Mingyang Liu: GPU Performance Acceleration via Intra-Group Sharing TLB. 705-714
- Baorong Ding, Mingcong Han, Rong Chen: DArray: A High Performance RDMA-Based Distributed Array. 715-724
- Hao Zhao, Si Wu, Haifeng Liu, Zhixiang Tang, Xiaochun He, Yinlong Xu: Toward Optimal Repair and Load Balance in Locally Repairable Codes. 725-735
- Zhigang Cai, Chengyong Tang, Minjun Li, François Trahay, Jun Li, Zhibing Sha, Jiaojiao Wu, Fan Yang, Jianwei Liao: Re-aligning Across-page Requests for Flash-based Solid-state Drives. 736-745
Optimization of AI/ML (In Person)
- Daegun Yoon, Sangyoon Oh: DEFT: Exploiting Gradient Norm Difference between Model Layers for Scalable Gradient Sparsification. 746-755
- Shenggui Li, Hongxin Liu, Zhengda Bian, Jiarui Fang, Haichen Huang, Yuliang Liu, Boxiang Wang, Yang You: Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training. 766-775
Numerics (Remote Session)
- Jie Yan, Zhang Yang, Aiqing Zhang, Zeyao Mo: JSweep: A Patch-centric Data-driven Approach for Parallel Sweeps on Large-scale Meshes. 776-785
- Mingzhen Li, Hailong Yang, Shanjun Zhang, Fengwei Yu, Ruihao Gong, Yi Liu, Zhongzhi Luan, Depei Qian: Exploiting Subgraph Similarities for Efficient Auto-tuning of Tensor Programs. 786-796
- Zhao Liu, Xuesen Chu, Xiaojing Lv, Hanyue Liu, Haohuan Fu, Guangwen Yang: Accelerating Large-Scale CFD Simulations with Lattice Boltzmann Method on a 40-Million-Core Sunway Supercomputer. 797-806
- Helin Cheng, Wenxuan Li, Yuechen Lu, Weifeng Liu: HASpGEMM: Heterogeneity-Aware Sparse General Matrix-Matrix Multiplication on Modern Asymmetric Multicore Processors. 807-817
- Ran Zhao, Chao Li, Xiaowei Guo, Yi Liu, Sifan Long, Sen Zhang, Yanlong Qiu, Canqun Yang: An Improved Parallel Overset Grid Method for Fluid Simulation with Moving Boundary. 818-827
- Jing Chen, Madhavan Manivannan, Bhavishya Goel, Miquel Pericàs: JOSS: Joint Exploration of CPU-Memory DVFS and Task Scheduling for Energy Efficiency. 828-838
