


28th HiPC 2021: Bengaluru, India
- 28th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2021, Bengaluru, India, December 17-20, 2021. IEEE 2021, ISBN 978-1-6654-1016-8

- Adam Belay: Improving Efficiency and Performance Through Faster Scheduling Mechanisms. xxii
- Jingren Zhou: Towards an Integral System for Processing Big Graphs at Scale. xxi
- Chi Zhang, Sanmukh Rao Kuppannagari, Viktor K. Prasanna: Parallel Actors and Learners: A Framework for Generating Scalable RL Implementations. 1-10
- Michela Taufer: AI4IO: A Suite of AI-Based Tools for IO-Aware HPC Resource Management. 1
- Amal Gueroudji, Julien Bigot, Bruno Raffin: DEISA: Dask-Enabled In Situ Analytics. 11-20
- A. Srinivas Reddy, P. Krishna Reddy, Anirban Mondal, U. Deva Priyakumar: A Model of Graph Transactional Coverage Patterns with Applications to Drug Discovery. 21-30
- Eliza Wszola, Martin Jaggi, Markus Püschel: Faster Parallel Training of Word Embeddings. 31-41
- Nariaki Tateiwa, Yuji Shinano, Keiichiro Yamamura, Akihiro Yoshida, Shizuo Kaji, Masaya Yasuda, Katsuki Fujisawa: CMAP-LAP: Configurable Massively Parallel Solver for Lattice Problems. 42-52
- Hwajung Kim, Jiwoo Bang, Dong Kyu Sung, Hyeonsang Eom, Heon Y. Yeom, Hanul Sung: MulConn: User-Transparent I/O Subsystem for High-Performance Parallel File Systems. 53-62
- Ta-Yang Wang, William Chang, Ajitesh Srivastava, Rajgopal Kannan, Viktor K. Prasanna: Monte Carlo Tree Search for Task Mapping onto Heterogeneous Platforms. 63-70
- Johannes Langguth, Ioannis Panagiotas, Bora Uçar: Shared-memory implementation of the Karp-Sipser kernelization process. 71-80
- Yuan Meng, Sanmukh R. Kuppannagari, Rajgopal Kannan, Viktor K. Prasanna: How to Avoid Zero-Spacing in Fractionally-Strided Convolution? A Hardware-Algorithm Co-Design Methodology. 81-90
- Jiawen Guan, Rui Fan: PPBT: A High Performance Parallel Search Tree. 91-100
- Esragul Korkmaz, Mathieu Faverge, Pierre Ramet, Grégoire Pichon: Deciding Non-Compressible Blocks in Sparse Direct Solvers using Incomplete Factorization. 101-110
- Athreya Chandramouli, Sayantan Jana, Kishore Kothapalli: Efficient Parallel Algorithms for Computing Percolation Centrality. 111-120
- André Weißenberger, Bertil Schmidt: Accelerating JPEG Decompression on GPUs. 121-130
- Kai Keller, Adrián Cristal Kestelman, Leonardo Bautista-Gomez: Towards Zero-Waste Recovery and Zero-Overhead Checkpointing in Ensemble Data Assimilation. 131-140
- Archie Powell, K. Choudry, Arun Prabhakar, I. Z. Reguly, Dario Amirante, Stephen A. Jarvis, Gihan R. Mudalige: Predictive Analysis of Large-Scale Coupled CFD Simulations with the CPX Mini-App. 141-151
- Akihiro Tabuchi, Koichi Shirahata, Masafumi Yamazaki, Akihiko Kasagi, Takumi Honda, Kouji Kurihara, Kentaro Kawakami, Tsuguchika Tabaru, Naoto Fukumoto, Akiyoshi Kuroda, Takaaki Fukai, Kento Sato: The 16,384-node Parallelism of 3D-CNN Training on An Arm CPU based Supercomputer. 152-161
- Luk Burchard, Xing Cai, Johannes Langguth: iPUG for Multiple Graphcore IPUs: Optimizing Performance and Scalability of Parallel Breadth-First Search. 162-171
- K. P. Arun, Debadatta Mishra, Biswabandan Panda: Empirical Analysis of Architectural Primitives for NVRAM Consistency. 172-181
- Kazuaki Matsumura, Simon Garcia de Gonzalo, Antonio J. Peña: JACC: An OpenACC Runtime Framework with Kernel-Level and Multi-GPU Parallelization. 182-191
- Oded Green, Zhihui Du, Sanyamee Patel, Zehui Xie, Hang Liu, David A. Bader: Anti-Section Transitive Closure. 192-201
- Xiaojing An, Ümit V. Çatalyürek: Column-Segmented Sparse Matrix-Matrix Multiplication on Multicore CPUs. 202-211
- Arjun Gopala Krishnan, Dhrubajyoti Goswami: Multi-Stage Memory Efficient Strassen's Matrix Multiplication on GPU. 212-221
- Md Nahid Newaz, Md Atiqul Mollah: Optimizing k-path selection for randomized interconnection networks. 222-231
- Siqin Liu, Avinash Karanth: Dynamic Voltage and Frequency Scaling to Improve Energy-Efficiency of Hardware Accelerators. 232-241
- Zhe Wang, Pradeep Subedi, Matthieu Dorier, Philip E. Davis, Manish Parashar: Adaptive Placement of Data Analysis Tasks For Staging Based In-Situ Processing. 242-251
- Qihan Wang, Wei Niu, Li Chen, Ruoming Jin, Bin Ren: HEALS: A Parallel eALS Recommendation System on CPU/GPU Heterogeneous Platforms. 252-261
- Xiang Li, Gagan Agrawal: Shrinking Sample Search Algorithm for Automatic Tuning of GPU Kernels. 262-271
- Bharath Ramesh, Jahanzeb Maqbool Hashmi, Shulei Xu, Aamir Shafi, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Hari Subramoni, Dhabaleswar K. Panda: Towards Architecture-aware Hierarchical Communication Trees on Modern HPC Systems. 272-281
- Yuntian He, Saket Gurukar, Pouya Kousha, Hari Subramoni, Dhabaleswar K. Panda, Srinivasan Parthasarathy: DistMILE: A Distributed Multi-Level Framework for Scalable Graph Embedding. 282-291
- Jinlai Xu, Balaji Palanisamy: Model-based Reinforcement Learning for Elastic Stream Processing in Edge Computing. 292-301
- Kaushik Kandadi Suresh, Bharath Ramesh, Chen-Chun Chen, Seyedeh Mahdieh Ghazimirsaeed, Mohammadreza Bayatpour, Aamir Shafi, Hari Subramoni, Dhabaleswar K. Panda: Layout-aware Hardware-assisted Designs for Derived Data Types in MPI. 302-311
- Xu T. Liu, Jesun Firoz, Andrew Lumsdaine, Cliff A. Joslyn, Sinan Aksoy, Brenda Praggastis, Assefaw H. Gebremedhin: Parallel Algorithms for Efficient Computation of High-Order Line Graphs of Hypergraphs. 312-321
- Sunwoo Lee, Qiao Kang, Kewei Wang, Jan Balewski, Alex Sim, Ankit Agrawal, Alok N. Choudhary, Peter Nugent, Kesheng Wu, Wei-keng Liao: Asynchronous I/O Strategy for Large-Scale Deep Learning Applications. 322-331
- Srinivasan Ramesh, Robert B. Ross, Matthieu Dorier, Allen D. Malony, Philip H. Carns, Kevin A. Huck: SYMBIOMON: A High-Performance, Composable Monitoring Service. 332-342
- Ke Fan, Duong Hoang, Steve Petruzza, Thomas Gilray, Valerio Pascucci, Sidharth Kumar: Load-balancing Parallel I/O of Compressed Hierarchical Layouts. 343-353
- Madhav Poudel, Michael Gowanlock: CUDA-DClust+: Revisiting Early GPU-Accelerated DBSCAN Clustering Designs. 354-363
- Leonel Toledo, Pedro Valero-Lara, Jeffrey S. Vetter, Antonio J. Peña: Static Graphs for Coding Productivity in OpenACC. 364-369
- Madhav Aggarwal, Bingyi Zhang, Viktor K. Prasanna: Performance of Local Push Algorithms for Personalized PageRank on Multi-core Platforms. 370-375
- Jacob Tronge, Patricia Grubel, Timothy Randles, Quincy Wofford, Rusty Davis, Steven Anaya, Qiang Guan: BEE Orchestrator: Running Complex Scientific Workflows on Multiple Systems. 376-381
- Hércules Cardoso da Silva, Marco Aurelio Stefanes, Vinícius Capistrano: OpenACC Multi-GPU Approach for WSM6 Microphysics. 382-387
- Nick Sarkauskas, Mohammadreza Bayatpour, Tu Tran, Bharath Ramesh, Hari Subramoni, Dhabaleswar K. Panda: Large-Message Nonblocking MPI_Iallgather and MPI_Ibcast Offload via BlueField-2 DPU. 388-393
- Yuanjian Liu, Sheng Di, Kai Zhao, Sian Jin, Cheng Wang, Kyle Chard, Dingwen Tao, Ian T. Foster, Franck Cappello: Optimizing Multi-Range based Error-Bounded Lossy Compression for Scientific Datasets. 394-399
- Jiwoo Bang, Chungyong Kim, Kesheng Wu, Alex Sim, Suren Byna, Hanul Sung, Hyeonsang Eom: An In-Depth I/O Pattern Analysis in HPC Systems. 400-405
- Anshuj Garg, Purushottam Kulkarni, Umesh Bellur, Sriram Yenamandra: FaaSter: Accelerated Functions-as-a-Service with Heterogeneous GPUs. 406-411
- Salman Salloum, Joshua Zhexue Huang: RSP-Hist: Approximate Histograms for Big Data Exploration on Hadoop Clusters. 412-417
- Shuangsheng Lou, Gagan Agrawal: A Programming API Implementation for Secure Data Analytics Applications with Homomorphic Encryption on GPUs. 418-423
- Jia Guo, Radu Teodorescu, Gagan Agrawal: A Fused Inference Design for Pattern-Based Sparse CNN on Edge Devices. 424-429
- Edigley Fraga, Ana Cortés, Tomàs Margalef, Porfidio Hernández: Cloud-Based Urgent Computing for Forest Fire Spread Prediction under Data Uncertainties. 430-435
- Mostafa Eghbali Zarch, Reece Neff, Michela Becchi: Exploring Thread Coarsening on FPGA. 436-441
- John Ravi, Tri Nguyen, Huiyang Zhou, Michela Becchi: PILOT: a Runtime System to Manage Multi-tenant GPU Unified Memory Footprint. 442-447
- S. Chandra Sekhara Rao, Rabia Kamra: A computational technique for parallel solution of diagonally dominant banded linear systems. 448-453















