


PACT 2014: Edmonton, AB, Canada
- José Nelson Amaral, Josep Torrellas:
International Conference on Parallel Architectures and Compilation, PACT '14, Edmonton, AB, Canada, August 24-27, 2014. ACM 2014, ISBN 978-1-4503-2809-8
Keynote I
- Klara Nahrstedt:
Internet of mobile things: challenges and opportunities. 1-2
Best papers
- Nuno Diegues, Paolo Romano, Luís E. T. Rodrigues:
Virtues and limitations of commodity hardware transactional memory. 3-14
- Jennifer B. Sartor, Wim Heirman, Stephen M. Blackburn, Lieven Eeckhout, Kathryn S. McKinley:
Cooperative cache scrubbing. 15-26
- Harshvardhan, Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger:
KLA: a new algorithmic paradigm for parallel graph computations. 27-38
- Uday Bondhugula, Vinayaka Bandishti, Albert Cohen, Guillain Potron, Nicolas Vasilache:
Tiling and optimizing time-iterated computations on periodic domains. 39-50
Session 2A: cache hierarchies (I)
- Cheng-Chieh Huang, Vijay Nagarajan:
ATCache: reducing DRAM cache latency via a small SRAM tag cache. 51-60
- Lunkai Zhang, Dmitri B. Strukov, Hebatallah Saadeldeen, Dongrui Fan, Mingzhe Zhang, Diana Franklin:
SpongeDirectory: flexible sparse directories utilizing multi-level memristors. 61-74
- Gaurav Chadha, Scott A. Mahlke, Satish Narayanasamy:
EFetch: optimizing instruction fetch for event-driven webapplications. 75-86
- Biswabandan Panda, Shankar Balachandran:
XStream: cross-core spatial streaming based MLC prefetchers for parallel applications in CMPs. 87-98
Session 2B1: parallelism studies
- Cedomir Segulja, Tarek S. Abdelrahman:
What is the cost of weak determinism? 99-112
- Ehsan Fatehi, Paul Gratz:
ILP and TLP in shared memory applications: a limit study. 113-126
Session 2B2: algorithms
- Wookeun Jung, Jongsoo Park, Jaejin Lee:
Versatile and scalable parallel histogram construction. 127-138
- Robert D. Cameron, Thomas C. Shermer, Arrvindh Shriraman, Kenneth S. Herdy, Dan Lin, Benjamin R. Hull, Meng Lin:
Bitwise data parallelism in regular expression matching. 139-150
Session 3A: gpus (I)
- Rashid Kaleem, Rajkishore Barik, Tatiana Shpeisman, Brian T. Lewis, Chunling Hu, Keshav Pingali:
Adaptive heterogeneous scheduling for integrated GPUs. 151-162
- James A. Jablin, Thomas B. Jablin, Onur Mutlu, Maurice Herlihy:
Warp-aware trace scheduling for GPUs. 163-174
- Shin-Ying Lee, Carole-Jean Wu:
CAWS: criticality-aware warp scheduling for GPGPU workloads. 175-186
Session 3B: transactional memory
- Irina Calciu, Justin Gottschlich, Tatiana Shpeisman, Gilles Pokam, Maurice Herlihy:
Invyswell: a hybrid transactional memory for haswell's restricted transactional memory. 187-200
- Lihang Zhao, Jeffrey T. Draper:
Consolidated conflict detection for hardware transactional memory. 201-212
- Kaushik Ravichandran, Ada Gavrilovska, Santosh Pande:
DeSTM: harnessing determinism in STMs for application development. 213-224
Session 4A: energy efficiency
- Qiumin Xu, Murali Annavaram:
PATS: pattern aware scheduling and power gating for GPGPUs. 225-236
- Andrew Lukefahr, Shruti Padmanabha, Reetuparna Das, Ronald G. Dreslinski, Thomas F. Wenisch, Scott A. Mahlke:
Heterogeneous microarchitectures trump voltage scaling for low-power cores. 237-250
- Hamid Reza Ghasemi, Nam Sung Kim:
RCS: runtime resource and core scaling for power-constrained multi-core processors. 251-262
Session 4B: runtime systems
- Sean Treichler, Michael Bauer, Alex Aiken:
Realm: an event-based low-level runtime for distributed memory architectures. 263-276
- Matthias Diener, Eduardo Henrique Molina da Cruz, Philippe Olivier Alexandre Navaux, Anselm Busse, Hans-Ulrich Heiß:
kMAF: automatic kernel-level management of thread and data affinity. 277-288
- Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan:
Shuffling: a framework for lock contention aware thread scheduling for multicore multiprocessor systems. 289-300
Keynote II
- Bob Blainey:
Domain-specific models for innovation in analytics. 301-302
Session 5A1: compiler frameworks
- Jason Ansel, Shoaib Kamil, Kalyan Veeramachaneni, Jonathan Ragan-Kelley, Jeffrey Bosboom, Una-May O'Reilly, Saman P. Amarasinghe:
OpenTuner: an extensible framework for program autotuning. 303-316
- Rahul Garg, Laurie J. Hendren:
Velociraptor: an embedded compiler toolkit for numerical programs targeting CPUs and GPUs. 317-330
Session 5A2: scheduling
- Hao Wang, Ripudaman Singh, Michael J. Schulte, Nam Sung Kim:
Memory scheduling towards high-throughput cooperative heterogeneous computing. 331-342
- Dragos Sbirlea, Zoran Budimlic, Vivek Sarkar:
Bounded memory scheduling of dynamic task graphs. 343-356
Session 6A: cache hierarchies (II)
- Wei Ding, Mahmut T. Kandemir, Diana R. Guttman, Adwait Jog, Chita R. Das, Praveen Yedlapalli:
Trading cache hit rate for memory performance. 357-368
- Guilherme Piccoli, Henrique Nazaré Santos, Raphael Ernani Rodrigues, Christiane Pousa, Edson Borin, Fernando Magno Quintão Pereira:
Compiler support for selective page migration in NUMA architectures. 369-380
- Ying Ye, Richard West, Zhuoqun Cheng, Ye Li:
COLORIS: a dynamic cache partitioning system using page coloring. 381-392
Session 6B: performance tools and i/o
- Arnamoy Bhattacharyya, Torsten Hoefler:
PEMOGEN: automatic adaptive performance modeling during program runtime. 393-404
- Xu Liu, Kamal Sharma, John M. Mellor-Crummey:
ArrayTool: a lightweight profiler to guide array regrouping. 405-416
- Arash Tavakkol, Mohammad Arjomand, Hamid Sarbazi-Azad:
Design for scalability in enterprise SSDs. 417-430
Session 7: gpus (II)
- Davoud Anoushe Jamshidi, Mehrzad Samadi, Scott A. Mahlke:
D2MA: accelerating coarse-grained data transfer for GPUs. 431-442
- Janghaeng Lee, Mehrzad Samadi, Scott A. Mahlke:
VAST: the illusion of a large memory space for GPUs. 443-454
- Alberto Magni, Christophe Dubach, Michael F. P. O'Boyle:
Automatic optimization of thread-coarsening for graphics processors. 455-466
Poster session
- Javier Cabezas, Lluís Vilanova, Isaac Gelado, Thomas B. Jablin, Nacho Navarro, Wen-mei W. Hwu:
Automatic execution of single-GPU computations across multiple GPUs. 467-468
- Alexandros-Herodotos Haritatos, Georgios I. Goumas, Nikos Anastopoulos, Konstantinos Nikas, Kornilios Kourtis, Nectarios Koziris:
LCA: a memory link and cache-aware co-scheduling approach for CMPs. 469-470
- Simon Holmbacka, Sébastien Lafond, Johan Lilius:
A run-time power manager exploiting software parallelism. 471-472
- Magnus Jahre:
Graph-based performance accounting for chip multiprocessor memory systems. 473-474
- Snehasish Kumar, Arrvindh Shriraman, Vijayalakshmi Srinivasan, Dan Lin, Jordon Phillips:
SQRL: hardware accelerator for collecting software data structures. 475-476
- Yulong Luo, Guangming Tan:
Optimizing stencil code via locality of computation. 477-478
- Deepak Majeti, Kuldeep S. Meel, Rajkishore Barik, Vivek Sarkar:
ADHA: automatic data layout framework for heterogeneous architectures. 479-480
- William F. Ogilvie, Pavlos Petoumenos, Zheng Wang, Hugh Leather:
Active learning accelerated automatic heuristic construction for parallel program mapping. 481-482
- Sreepathi Pai, R. Govindarajan, Matthew J. Thazhuthaveetil:
Preemptive thread block scheduling with online structural runtime prediction for concurrent GPGPU kernels. 483-484
- Xiang Pan, Radu Teodorescu:
Using STT-RAM to enable energy-efficient near-threshold chip multiprocessors. 485-486
- Raj Parihar, Jacob Brock, Chen Ding, Michael C. Huang:
Protection and utilization in shared cache through rationing. 487-488
- Pushkar Ratnalikar, Arun Chauhan:
Automatic parallelism through macro dataflow in high-level array languages. 489-490
- Sudarshan Srinivasan, Nithesh Kurella, Israel Koren, Rance Rodrigues, Sandip Kundu:
A runtime support mechanism for fast mode switching of a self-morphing core for power efficiency. 491-492
- Bradley Thwaites, Gennady Pekhimenko, Hadi Esmaeilzadeh, Amir Yazdanbakhsh, Onur Mutlu, Jongse Park, Girish Mururu, Todd C. Mowry:
Rollback-free value prediction with approximate loads. 493-494
- Erik Tomusk, Christophe Dubach, Michael F. P. O'Boyle:
Measuring flexibility in single-ISA heterogeneous processors. 495-496
- Bo Wu, Guoyang Chen, Dong Li, Xipeng Shen, Jeffrey S. Vetter:
SM-centric transformation: circumventing hardware restrictions for flexible GPU scheduling. 497-498
Poster Board
- Serguei Makarov, Angela Demke Brown, Ashvin Goel:
An event-based language for dynamic binary translation frameworks. 499-500
- Peng Li, Jeremy Buhler:
Improving performance of streaming applications with filtering and control messages. 501-502
- Jeeva Paudel, José Nelson Amaral:
Stratified sampling for even workload partitioning. 503-504
- Tejaswi Agarwal, Michela Becchi:
Design of a hybrid MPI-CUDA benchmark suite for CPU-GPU clusters. 505-506
- Sudharsan Jagathrakshakan, Venkata Kalyan Tavva, Madhu Mutyam:
Data remapping for an energy efficient burst chop in DRAM memory systems. 507-508
- Alexandre Isoard:
Data-reuse optimizations for pipelined tiling with parametric tile sizes. 509-510
- Adam Fidel, Nancy M. Amato, Lawrence Rauchwerger:
From petascale to the pocket: Adaptively scaling parallel programs for mobile SoCs. 511-512
- Alessandro Fanfarillo, Tobias Burnus, Valeria Cardellini, Salvatore Filippone, Dan Nagle, Damian W. I. Rouson:
Coarrays in GNU Fortran. 513-514
- Thomas R. W. Scogland, Wu-Chun Feng:
Locality-aware memory association for multi-target worksharing in OpenMP. 515-516
- Harshvardhan, Nancy M. Amato, Lawrence Rauchwerger:
Processing big data graphs on memory-restricted systems. 517-518















