24th PPoPP 2019: Washington, DC, USA
- Jeffrey K. Hollingsworth, Idit Keidar:

Proceedings of the 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2019, Washington, DC, USA, February 16-20, 2019. ACM 2019, ISBN 978-1-4503-6225-2 - Joel Hestness, Newsha Ardalani, Gregory F. Diamos:

Beyond human-level accuracy: computational challenges in deep learning. 1-14 - Junmin Xiao, Shijie Wang, Weiqiang Wan, Xuehai Hong, Guangming Tan:

S-EnKF: co-designing for scalable ensemble Kalman filter. 15-26 - Isaac Gelado, Michael Garland:

Throughput-oriented GPU memory allocation. 27-37 - Hao Wang, Liang Geng, Rubao Lee, Kaixi Hou, Yanfeng Zhang, Xiaodong Zhang:

SEP-graph: finding shortest execution paths for graph processing under a hybrid framework on GPU. 38-52 - Troels Henriksen, Frederik Thorøe, Martin Elsman, Cosmin E. Oancea:

Incremental flattening for nested data parallelism. 53-67 - Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger:

Adaptive sparse matrix-matrix multiplication on the GPU. 68-81 - Brijesh Dongol, Radha Jagadeesan, James Riely:

Modular transactions: bounding mixed races in space and time. 82-93 - Ryan Yates, Michael L. Scott:

Leveraging hardware TM in Haskell. 94-106 - Ricardo Filipe, Shady Issa, Paolo Romano, João Barreto:

Stretching the capacity of hardware transactional memory in IBM POWER architectures. 107-119 - Mohamed M. Saad, Masoomeh Javidi Kishi, Shihao Jing, Sandeep Hans, Roberto Palmieri:

Processing transactions in a predefined order. 120-132 - Zhaofeng Yan, Yuzhe Lin, Lu Peng, Weihua Zhang:

Harmonia: a high throughput B+tree for GPUs. 133-144 - Muhammad A. Awad, Saman Ashkiani, Rob Johnson, Martin Farach-Colton, John D. Owens:

Engineering a high-performance GPU B-Tree. 145-157 - Xiaokang Hu, Changzheng Wei, Jian Li, Brian Will, Ping Yu, Lu Gong, Haibing Guan:

QTLS: high-performance TLS asynchronous offload framework with Intel® QuickAssist technology. 158-172 - Fabian Gruber, Manuel Selva, Diogo Sampaio, Christophe Guillon, Antoine Moynault, Louis-Noël Pouchet, Fabrice Rastello:

Data-flow/dependence profiling for structured transformations. 173-185 - Qingsen Wang, Pengfei Su, Milind Chabbi, Xu Liu:

Lightweight hardware transactional memory profiling. 186-200 - Ke Meng, Jiajia Li, Guangming Tan, Ninghui Sun:

A pattern based algorithmic autotuner for graph processing on GPUs. 201-213 - Umut A. Acar, Vitaly Aksenov, Arthur Charguéraud, Mike Rainey:

Provably and practically efficient granularity control. 214-228 - Xiuhong Li, Yun Liang, Shengen Yan, Liancheng Jia, Yinghan Li:

A coordinated tiling and batching framework for efficient GEMM on GPUs. 229-241 - Qi Zhao, Zhengyi Qiu, Guoliang Jin:

Semantics-aware scheduling policies for synchronization determinism. 242-256 - Kyle Singer, Yifan Xu, I-Ting Angelina Lee:

Proactive work stealing for futures. 257-271 - Loc Hoang, Matteo Pontecorvi, Roshan Dathathri, Gurbinder Gill, Bozhi You, Keshav Pingali, Vijaya Ramachandran:

A round-efficient distributed betweenness centrality algorithm. 272-286 - Martin Küttler, Maksym Planeta, Jan Bierbaum, Carsten Weinhold, Hermann Härtig, Amnon Barak, Torsten Hoefler:

Corrected trees for reliable group communication. 287-299 - Changwan Hong, Aravind Sukumaran-Rajam, Israt Nisa, Kunal Singh, P. Sadayappan:

Adaptive sparse tiling for sparse matrix multiplication. 300-314 - Martin Bättig, Thomas R. Gross:

Encapsulated open nesting for STM: fine-grained higher-level conflict detection. 315-326 - Herbert Jordan, Pavle Subotic, David Zhao, Bernhard Scholz:

A specialized B-tree for concurrent datalog evaluation. 327-339 - Robert Utterback, Kunal Agrawal, Jeremy T. Fineman, I-Ting Angelina Lee:

Efficient race detection with futures. 340-354 - Simon Doherty, Brijesh Dongol, Heike Wehrheim, John Derrick:

Verifying C11 programs operationally. 355-365 - Burcu Kulahcioglu Ozkan, Rupak Majumdar, Filip Niksic:

Checking linearizability using hitting families. 366-377 - Caleb Voss, Tiago Cogumbreiro, Vivek Sarkar:

Transitive joins: a sound and efficient online deadlock-avoidance policy. 378-390 - Jiawen Sun, Hans Vandierendonck, Dimitrios S. Nikolopoulos:

VEBO: a vertex- and edge-balanced ordering heuristic to load balance parallel graph processing. 391-392 - Kartik Lakhotia, Rajgopal Kannan, Sourav Pati, Viktor K. Prasanna:

GPOP: a cache and memory-efficient framework for graph processing over partitions. 393-394 - Somesh Singh, Rupesh Nasre:

Optimizing graph processing on GPUs using approximate computing: poster. 395-396 - Jinrong Guo, Wantao Liu, Wang Wang, Qu Lu, Songlin Hu, Jizhong Han, Ruixuan Li:

A GPU memory efficient speed-up scheme for training ultra-deep neural networks: poster. 397-398 - Yuki Ito, Haruki Imai, Tung D. Le, Yasushi Negishi, Kiyokuni Kawachiya, Ryo Matsumiya, Toshio Endo:

Profiling based out-of-core hybrid method for large neural networks: poster. 399-400 - Xiao Dong, Lei Liu, Guangli Li, Jiansong Li, Peng Zhao, Xueying Wang, Xiaobing Feng:

Exploiting the input sparsity to accelerate deep neural networks: poster. 401-402 - Peng Jiang, Gagan Agrawal:

Accelerating distributed stochastic gradient descent with adaptive periodic parameter averaging: poster. 403-404 - Putt Sakdhnagool, Amit Sabne, Rudolf Eigenmann:

Optimizing GPU programs by register demotion: poster. 405-406 - Yubin Chen, Zhuocheng Ding, Jin Zhang, Yun Wang, Zhengwei Qi, Haibing Guan:

A distributed hypervisor for resource aggregation: poster. 407-408 - Mohamed Lamine Karaoui, Anthony Carno, Robert Lyerly, Sang-Hoon Kim, Pierre Olivier, Changwoo Min, Binoy Ravindran:

Scheduling HPC workloads on heterogeneous-ISA architectures: poster. 409-410 - Da Yan, Guimu Guo, Md Mashiur Rahman Chowdhury, M. Tamer Özsu, John C. S. Lui, Weida Tan:

T-thinker: a task-centric distributed framework for compute-intensive divide-and-conquer algorithms. 411-412 - Mohammad Mahdi Javanmard, Pramod Ganapathi, Rathish Das, Zafar Ahmad, Stephen L. Tschudi, Rezaul Chowdhury:

Toward efficient architecture-independent algorithms for dynamic programs: poster. 413-414 - Emilio Castillo, Nikhil Jain, Marc Casas, Miquel Moretó, Martin Schulz, Ramón Beivide, Mateo Valero, Abhinav Bhatele:

Optimizing computation-communication overlap in asynchronous task-based programs: poster. 415-416 - Nikita Koval, Dan Alistarh, Roman Elizarov:

Lock-free channels for programming via communicating sequential processes: poster. 417-418 - Naama Ben-David, Guy E. Blelloch, Michal Friedman, Yuanhao Wei:

Making concurrent algorithms detectable: poster. 419-420 - Kunpeng Wang, Shizhen Xu, Hongkun Yu, Haohuan Fu, Guangwen Yang:

GPU-based 3D cryo-EM reconstruction with key-value streams: poster. 421-422 - Athena Elafrou, Georgios I. Goumas, Nectarios Koziris:

BASMAT: bottleneck-aware sparse matrix-vector multiplication auto-tuning on GPGPUs. 423-424 - Avner Elizarov, Guy Golan-Gueta, Erez Petrank:

LOFT: lock-free transactional data structures. 425-426 - Xiang Ni, Scott Schneider, Raju Pavuluri, Jonathan Kaus, Kun-Lung Wu:

Automated multi-dimensional elasticity for streaming runtimes: poster. 427-428 - Marcelo Novaes, Vinicius Petrucci, Abdoulaye Gamatié, Fernando Magno Quintão Pereira:

Compiler-assisted adaptive program scheduling in big.LITTLE systems: poster. 429-430 - Chanyoung Oh, Zhen Zheng, Xipeng Shen, Jidong Zhai, Youngmin Yi:

GOPipe: a granularity-oblivious programming framework for pipelined stencil executions on GPU. 431-432 - Tim Kaler, Brian Wheatman, Sarah Wooders:

High-throughput image alignment for connectomics using frugal snap judgments: poster. 433-434 - Xiaolong Xie, Yun Liang, Xiuhong Li, Wei Tan:

CuLDA_CGS: solving large-scale LDA problems on GPUs. 435-436 - Sharanyan Srikanthan, Princeton Ferro, Sayak Chakraborti, Sandhya Dwarkadas:

Managing application parallelism via parallel efficiency regulation: poster. 437-438 - Emmanuelle Anceaume, Antonella Del Pozzo, Romaric Ludinard, Maria Potop-Butucaru, Sara Tucci Piergiovanni:

Blockchain abstract data type: poster. 439-440 - Ivo Jimenez, Jay F. Lofstead, Carlos Maltzahn:

Creating repeatable, reusable experimentation pipelines with popper: tutorial. 441-442 - Travis Carlson, Eric Van Wyk:

Building parallel programming language constructs in the AbleC extensible C compiler framework: a PPoPP tutorial. 443-446 - Yihan Sun, Guy E. Blelloch:

Implementing parallel and concurrent tree structures. 447-450 - Frank Mueller, Greg Byrd, Patrick Dreher:

Programming quantum computers: a primer with IBM Q and D-Wave exercises. 451 - Dhabaleswar K. Panda, Ammar Ahmad Awan, Hari Subramoni:

High performance distributed deep learning: a beginner's guide. 452-454 - David Beckingsale, Richard D. Hornung, Tom Scogland, Arturo Vargas:

Performance portable C++ programming with RAJA. 455-456
