


CCF Transactions on High Performance Computing, Volume 7
Volume 7, Number 1, February 2025
- Hanzheng Liang, Chencheng Deng, Peng Zhang, Jianbin Fang, Tao Tang, Chun Huang:
An empirical performance evaluation of SYCL on ARM multi-core processors. 1-16
- Youxuan Xu, Tong Wu, Shigang Li, Xueying Wang, Jingjing Wang:
SparkAttention: high-performance multi-head attention for large models on Volta GPU architecture. 17-28
- Tao Huang, Yonggui Liang, Shubao Yu, Kexin Chen:
TxCocket: an innovative solution for efficient cross-node data transmission enabled by CXL-based shared memory. 29-42
- Wenhao Dai, Ziyi Jia, Yuesi Bai, Qingxiao Sun:
Convergence-aware operator-wise mixed-precision training. 43-57
- Jin Zhang, Jincheng Zhou, Xiang Zhang, Di Ma, Chunye Gong:
Fine-grained vectorized merge sorting on RISC-V: from register to cache. 58-71
- Muchun Peng, Qinglin Wang, Yuechao Liang, Weihao Guo, Shun Yang, Yaling Liang, Yongzhen Shi, Ligang Cao, Jie Liu:
GreenB+Tree: an energy-efficient B+tree for MIMD architectures. 72-84
Volume 7, Number 2, April 2025
- Pin Chen, Qing Mo, Zexin Xu, Xianwei Zhang, Yutong Lu:
Star-gen: an HPC-AI framework for constructing large-scale computational materials database. 85-99
- Wentao Feng, Shizhe Shang, Pengfei Li, Hailong Yang, Zhongzhi Luan, Depei Qian:
SyncNOVA: an end-to-end fine-grained profiling tool oN lOck behaVior detection and critical section diAgnosis. 100-113
- Ningxi Tian, Silu Huang, Xiaowen Xu:
Mixed precision block-Jacobi preconditioner: algorithms, performance evaluation and feature analysis. 114-128
- Jianfei Xu, Lianhua He, Zhong Jin:
Mixed precision SpMV on GPUs for irregular data with hierarchical precision selection. 129-141
- Wenlong Fan, Haobo Hua, Jiandong Shang, Zhuxin Wen, Hengliang Guo, Litao Zhang:
Optimizing 2D convolution for DCUs. 142-154
- Xiangyu Meng, Xun Wang, Mingzhen Li, Guangming Tan, Weile Jia:
An interpretable DeePMD-kit performance model for emerging supercomputers. 155-168
- Heming Zhong, Xiaojian Pan, Zengquang He, Haoling Wang, Dan Huang, Zhiguang Chen:
GPU acceleration for DNA sequence alignment algorithm and its application. 169-177
Volume 7, Number 3, June 2025
- Zhao Mao, Xingjun Zhang, Longxiang Wang:
KANETAS: an elastic scheduler for heterogeneous many-core systems. 179-193
- Dongting Chen, Jie Shen, Chun Huang, Xin Yi:
An empirical study of error-free transformations for enhancing mathematical function precision. 194-210
- Hengzhong Liang, Han Huang, Xianwei Zhang:
SuCL: supply unified communication layer to improve SYCL-based heterogeneous computing. 211-225
- Zhangjie Tan, Jinfang Jia, Zhengsheng Ning, Jianqiang Huang, Xiaoying Wang:
Research on GPU transplantation optimization of PRM scalar advection scheme in GRAPES global forecast system. 226-244
- Yalin Zhu, Youquan Chang, Jiapeng Zhang, Yingjie Song, Zhuo Tang:
An optimized hierarchical MapReduce framework in supercomputing Internet environment. 245-259
- Da Huo, Xin You, Zhibo Xuan, Hailong Yang, Zhongzhi Luan, Depei Qian:
Hotspy: identifying performance hotspot with graph neural network based static analysis. 260-274
- Yunkun Liao, Jingya Wu, Wenyan Lu, Huawei Li, Xiaowei Li, Guihai Yan:
FUS: FPGA-based Universal Sketch with homogeneous and heterogeneous memory architectures. 275-290
- Ronghui Cao, Peng Zhang, Yiming Wu, Jun Liu, Haibin Su:
Adaptive container scheduling based on reinforcement learning in kubernetes. 291-304
Volume 7, Number 5, October 2025
- Kai Di, Pan Li, Tienyu Zuo, Fulin Chen, Yuanshuang Jiang, Lei Kong, Yichuan Jiang, Dan Chen:
Optimizing data interaction strategies for unreliable agents in multiplex networked industrial environments. 379-402
- Xiaoyong Tang, Xiaotian Li, Ronghui Cao:
An online resource-aware leader election algorithm based on Kubernetes load balancing. 403-412
- Edward Chuah, Arshad Jhumka, Sai Narasimhamurthy, Aladdin Ayesh:
Deep learning-based prediction of major page faults in cluster systems. 413-430
- Xiaoyong Fan, Yuan Zhuang, Yunhui Zeng:
A barotropic solver capable of reducing global synchronization latency in parallel ocean program. 431-446
- Xiaoning Wang, Yining Zhao, Shasha Lu, Haili Xiao:
Practice and observation: live migration for MPI workload. 447-464
- Mengsi He, Zhongming Fu, Wenlong Tian:
Optimization of fault tolerance for iterative graph algorithm in spark GraphX based on high performance computing cluster. 465-477















