


Parallel Computing, Volume 39
Volume 39, Number 1, January 2013
- Dana Jacobsen, Inanc Senocak:
Multi-level parallelism for incompressible flow computations on GPU clusters. 1-20
- Masha Sosonkina, Layne T. Watson, Nicholas R. Radcliffe, Raphael T. Haftka, Michael W. Trosset:
Adjusting process count on demand for petascale global optimization. 21-35
- Diego Andrade, Basilio B. Fraguela, Ramon Doallo:
Accurate prediction of the behavior of multithreaded applications in shared caches. 36-57
- Orlando Ayala, Lian-Ping Wang:
Parallel implementation and scalability analysis of 3D Fast Fourier Transform using 2D domain decomposition. 58-77
Volume 39, Number 2, February 2013
- Abhinav Sarje, Srinivas Aluru:
All-pairs computations on many-core graphics processors. 79-93
- Ferit Büyükkeçeci, Omar Awile, Ivo F. Sbalzarini:
A portable OpenCL implementation of generic particle-mesh and mesh-particle interpolation in 2D and 3D. 94-111
Volume 39, Number 3, March 2013
- Mark W. Krentel:
Libmonitor: A tool for first-party monitoring. 114-119
- Nick Rutar, Jeffrey K. Hollingsworth:
Software techniques for negating skid and approximating cache miss measurements. 120-131
- Marc-André Hermanns, Sriram Krishnamoorthy, Felix Wolf:
A scalable infrastructure for the performance analysis of passive target synchronization. 132-145
- Michael O. Lam, Jeffrey K. Hollingsworth, G. W. Stewart:
Dynamic floating-point cancellation detection. 146-155
- Barry Rountree, Todd Gamblin, Bronis R. de Supinski, Martin Schulz, David K. Lowenthal, Guy Cobb, Henry M. Tufo:
Parallelizing heavyweight debugging tools with mpiecho. 156-166
- Joshua D. Goehner, Dorian C. Arnold, Dong H. Ahn, Gregory L. Lee, Bronis R. de Supinski, Matthew P. LeGendre, Barton P. Miller, Martin Schulz:
LIBI: A framework for bootstrapping extreme scale software systems. 167-176
Volume 39, Numbers 4-5, April - May 2013
- Sen Su, Jian Li, Qingjia Huang, Xiao Huang, Kai Shuang, Jie Wang:
Cost-efficient task scheduling for executing large programs in the cloud. 177-188
- George Teodoro, Tony Pan, Tahsin M. Kurç, Jun Kong, Lee A. D. Cooper, Joel H. Saltz:
Efficient irregular wavefront propagation algorithms on hybrid CPU-GPU machines. 189-211
- Jack J. Dongarra, Mathieu Faverge, Thomas Hérault, Mathias Jacquelin, Julien Langou, Yves Robert:
Hierarchical QR factorization algorithms for multi-core clusters. 212-232
- Wagner Kolberg, Pedro de B. Marcos, Julio C. S. dos Anjos, Alexandre K. S. Miyazaki, Cláudio Fernando Resin Geyer, Luciana Arantes:
MRSG - A MapReduce simulator over SimGrid. 233-244
Volume 39, Numbers 6-7, June - July 2013
- Andrew V. Terekhov:
A fast parallel algorithm for solving block-tridiagonal systems of linear equations including the domain decomposition method. 245-258
- Christian Obrecht, Frédéric Kuznik, Bernard Tourancheau, Jean-Jacques Roux:
Scalable lattice Boltzmann solvers for CUDA GPU clusters. 259-270
- Yuefan Deng, Peng Zhang, Carlos Marques, Reid Powell, Li Zhang:
Analysis of Linpack and power efficiencies of the world's TOP500 supercomputers. 271-279
- Ichitaro Yamazaki, Hiroto Tadano, Tetsuya Sakurai, Tsutomu Ikegami:
Performance comparison of parallel eigensolvers based on a contour integral method and a Lanczos method. 280-290
Volume 39, Number 8, August 2013
- Yang Wang, Paul Lu:
DDS: A deadlock detection-based scheduling algorithm for workflow computations in HPC systems with storage constraints. 291-305
- A. Sandroos, Ilja Honkonen, Sebastian von Alfthan, Minna Palmroth:
Multi-GPU simulations of Vlasov's equation using Vlasiator. 306-318
- Oliver Fortmeier, H. Martin Bücker, B. O. Fagginger Auer, Rob H. Bisseling:
A new metric enabling an exact hypergraph model for the communication volume in distributed-memory parallel applications. 319-335
- Harald Servat, Germán Llort, Kevin A. Huck, Judit Giménez, Jesús Labarta:
Framework for a productive performance optimization. 336-353
Volume 39, Number 9, September 2013
- Fangyang Shen, Mei Yang, Maurizio Palesi:
Guest Editors' Introduction to the Special Issue on "Novel On-Chip Parallel Architectures and Software Support". 355-356
- Sandeep Pande, Fearghal Morgan, Gerard J. M. Smit, Tom M. Bruintjes, Jochem H. Rutgers, Brian McGinley, Seamus Cawley, Jim Harkin, Liam McDaid:
Fixed latency on-chip interconnect for hardware spiking neural network architectures. 357-371
- Junghee Lee, Chrysostomos Nicopoulos, Hyung Gyu Lee, Jongman Kim:
Sharded Router: A novel on-chip router architecture employing bandwidth sharding and stealing. 372-388
- Michael Opoku Agyeman, Ali Ahmadinia, Alireza Shahrabi:
Efficient routing techniques in heterogeneous 3D Networks-on-Chip. 389-407
- Xiaohang Wang, Peng Liu, Mei Yang, Yingtao Jiang:
Avoiding request-request type message-dependent deadlocks in networks-on-chips. 408-423
- Ashkan Beyranvand Nejad, Anca Mariana Molnos, Matias Escudero Martinez, Kees Goossens:
A hardware/software platform for QoS bridging over multi-chip NoC-based systems. 424-441
- José M. Andión, Manuel Arenaz, Gabriel Rodríguez, Juan Touriño:
A novel compiler support for automatic parallelization on multicore systems. 442-460
- Jiyang Yu, Peng Liu, Weidong Wang, Chunming Huang, Jie Yang, Yingtao Jiang, Qingdong Yao:
An efficient protocol with synchronization accelerator for multi-processor embedded systems. 461-474
- Carlos H. Gonzalez, Basilio B. Fraguela:
A framework for argument-based task synchronization with automatic detection of dependencies. 475-489
- Guiyuan Jiang, Jigang Wu, Jizhou Sun:
Efficient reconfiguration algorithms for communication-aware three-dimensional processor arrays. 490-503
- Giovanni Mariani, Gianluca Palermo, Vittorio Zaccaria, Cristina Silvano:
ARTE: An Application-specific Run-Time managEment framework for multi-cores based on queuing models. 504-519
- Jingweijia Tan, Yang Yi, Fangyang Shen, Xin Fu:
Modeling and characterizing GPGPU reliability in the presence of soft errors. 520-532
Volume 39, Number 10, October 2013
- Marcin Krotkiewski, Marcin Dabrowski:
Efficient 3D stencil computations using CUDA. 533-548
- Jaume Joven, Andrea Marongiu, Federico Angiolini, Luca Benini, Giovanni De Micheli:
An integrated, programming model-driven framework for NoC-QoS support in cluster-based embedded many-cores. 549-566
- Laiping Zhao, Yizhi Ren, Kouichi Sakurai:
Reliable workflow scheduling with less resource redundancy. 567-585
- Libo Huang, Nong Xiao, Zhiying Wang, Yongwen Wang, Ming-che Lai:
Efficient multimedia coprocessor with enhanced SIMD engines for exploiting ILP and DLP. 586-602
- Dimitris Saougkos, George Manis:
Self adaptive run time scheduling for the automatic parallelization of loops with the C2μTC/SL compiler. 603-614
- Agustín C. Caminero, Antonio Robles-Gómez, Salvador Ros, Roberto Hernández, Llanos Tobarra:
P2P-based resource discovery in dynamic grids allowing multi-attribute and range queries. 615-637
- Xiaoliang Wan, Guang Lin:
Hybrid parallel computing of minimum action method. 638-651
Volume 39, Number 11, November 2013
- Gregory Tauer, Rakesh Nagi:
A map-reduce lagrangian heuristic for multidimensional assignment problems with decomposable costs. 653-668
- Gihan R. Mudalige, Mike B. Giles, Jeyarajan Thiyagalingam, I. Z. Reguly, Carlo Bertolli, Paul H. J. Kelly, Anne E. Trefethen:
Design and initial performance of a high-level unstructured mesh framework on heterogeneous parallel systems. 669-692
- Hameed Hussain, Saif Ur Rehman Malik, Abdul Hameed, Samee Ullah Khan, Gage Bickler, Nasro Min-Allah, Muhammad Bilal Qureshi, Limin Zhang, Yongji Wang, Nasir Ghani, Joanna Kolodziej, Albert Y. Zomaya, Cheng-Zhong Xu, Pavan Balaji, Abhinav Vishnu, Frédéric Pinel, Johnatan E. Pecero, Dzmitry Kliazovich, Pascal Bouvry, Hongxiang Li, Lizhe Wang, Dan Chen, Ammar Rayes:
A survey on resource allocation in high performance distributed computing systems. 709-736
- Hoang-Vu Dang, Bertil Schmidt:
CUDA-enabled Sparse Matrix-Vector Multiplication on GPUs using atomic operations. 737-750
Volume 39, Number 12, December 2013
- Yong Chen, Pavan Balaji, Abhinav Vishnu:
Special issue on programming models, systems software, and tools for High-End Computing. 751-752
- Wei Tang, Dongxu Ren, Zhiling Lan, Narayan Desai:
Toward balanced and sustainable job scheduling for production supercomputers. 753-768
- Mark K. Gardner, Paul Sathre, Wu-chun Feng, Gabriel Martinez:
Characterizing the challenges and evaluating the efficacy of a CUDA-to-OpenCL translator. 769-786
- Zhiyi Huang, Kai-Cheung Leung:
Performance evaluation of View-Oriented Transactional Memory. 787-801
- Ekow J. Otoo, Gideon Nimako, Daniel Ohene-Kwofie:
Chunked extendible dense arrays for scientific data storage. 802-818
- Shannon Steinfadt:
Fine-grained parallel implementations for SWAMP+ Smith-Waterman alignment. 819-833
- Jie Shen, Jianbin Fang, Henk J. Sips, Ana Lucia Varbanescu:
An application-centric evaluation of OpenCL on multi-core CPUs. 834-850
- Hisham Mohamed, Stéphane Marchand-Maillet:
MRO-MPI: MapReduce overlapping using MPI and an optimized data exchange policy. 851-866
- Omer Erdil Albayrak, Ismail Akturk, Ozcan Ozturk:
Improving application behavior on heterogeneous manycore systems through kernel mapping. 867-878
- Alexander Reinefeld, Robert Döbbelin, Thorsten Schütt:
Analyzing the performance of SMP memory allocators with iterative MapReduce applications. 879-889
