Samuel Williams 0001
Person information
- affiliation: Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- affiliation (PhD 2008): University of California at Berkeley, CA, USA
2020 – today
- 2024
- [j26] Nan Ding, Pieter Maris, Hai Ah Nam, Taylor L. Groves, Muaaz Gul Awan, LeAnn Lindsey, Christopher S. Daley, Oguz Selvitopi, Leonid Oliker, Nicholas J. Wright, Samuel Williams:
Evaluating the potential of disaggregated memory systems for HPC applications. Concurr. Comput. Pract. Exp. 36(19) (2024)
- [c72] Mahesh Lakshminarasimhan, Mary W. Hall, Samuel Williams, Oscar Antepara:
BrickDL: Graph-Level Optimizations for DNNs with Fine-Grained Data Blocking on GPUs. ICPP 2024: 576-586
- [i9] Zhe Bai, Xishuo Wei, William Tang, Leonid Oliker, Zhihong Lin, Samuel Williams:
FTL: Transfer Learning Nonlinear Plasma Dynamic Transitions in Low Dimensional Embeddings via Deep Neural Networks. CoRR abs/2404.17466 (2024)
- [i8] Xuan Jiang, Raja Sengupta, James Demmel, Samuel Williams:
LPSim: Large Scale Multi-GPU Parallel Computing based Regional Scale Traffic Simulation Framework. CoRR abs/2406.08496 (2024)
- 2023
- [c71] Yang Liu, Nan Ding, Piyush Sao, Samuel Williams, Xiaoye Sherry Li:
Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters. SC 2023: 50:1-50:15
- [c70] Oscar Antepara, Samuel Williams, Hans Johansen, Tuowen Zhao, Samantha Hirsch, Priya Goyal, Mary W. Hall:
Performance Portability Evaluation of Blocked Stencil Computations on GPUs. SC Workshops 2023: 1005-1018
- [c69] Nan Ding, Muhammad Haseeb, Taylor L. Groves, Samuel Williams:
Evaluating the Performance of One-sided Communication on CPUs and GPUs. SC Workshops 2023: 1059-1069
- [c68] Oscar Antepara, Samuel Williams, Scott Kruger, Torrin Bechtel, Joseph McClenaghan, Lang Lao:
Performance-Portable GPU Acceleration of the EFIT Tokamak Plasma Equilibrium Reconstruction Code. SC Workshops 2023: 1939-1948
- [i7] Nan Ding, Pieter Maris, Hai Ah Nam, Taylor L. Groves, Muaaz Gul Awan, LeAnn Lindsey, Christopher S. Daley, Oguz Selvitopi, Leonid Oliker, Nicholas J. Wright, Samuel Williams:
Evaluating the Potential of Disaggregated Memory Systems for HPC applications. CoRR abs/2306.04014 (2023)
- 2022
- [j25] Nan Ding, Muaaz G. Awan, Samuel Williams:
Instruction Roofline: An insightful visual performance model for GPUs. Concurr. Comput. Pract. Exp. 34(20) (2022)
- [j24] Tan Nguyen, Colin MacLean, Marco Siracusa, Douglas Doerfler, Nicholas J. Wright, Samuel Williams:
FPGA-based HPC accelerators: An evaluation on performance and energy efficiency. Concurr. Comput. Pract. Exp. 34(20) (2022)
- [j23] Marco Siracusa, Emanuele Del Sozzo, Marco Rabozzi, Lorenzo Di Tucci, Samuel Williams, Donatella Sciuto, Marco Domenico Santambrogio:
A Comprehensive Methodology to Optimize FPGA Designs via the Roofline Model. IEEE Trans. Computers 71(8): 1903-1915 (2022)
- [c67] Benjamin Sepanski, Tuowen Zhao, Hans Johansen, Samuel Williams:
Maximizing Performance Through Memory Hierarchy-Driven Data Layout Transformations. MCHPC@SC 2022: 1-10
- [c66] Taylor L. Groves, Christopher S. Daley, Rahulkumar Gayatri, Hai Ah Nam, Nan Ding, Lenny Oliker, Nicholas J. Wright, Samuel Williams:
A Methodology for Evaluating Tightly-integrated and Disaggregated Accelerated Architectures. PMBS@SC 2022: 71-81
- [i6] Sridutt Bhalachandra, Brian Austin, Samuel Williams, Nicholas J. Wright:
Understanding the Impact of Input Entropy on FPU, CPU, and GPU Power. CoRR abs/2212.08805 (2022)
- 2021
- [c65] Nan Ding, Yang Liu, Samuel Williams, Xiaoye S. Li:
A Message-Driven, Multi-GPU Parallel Sparse Triangular Solver. ACDA 2021: 147-159
- [c64] Douglas Doerfler, Farzad Fatollahi-Fard, Colin MacLean, Tan Nguyen, Samuel Williams, Nicholas J. Wright, Marco Siracusa:
Experiences Porting the SU3_Bench Microbenchmark to the Intel Arria 10 and Xilinx Alveo U280 FPGAs. IWOCL 2021: 1:1-1:9
- [c63] Khaled Z. Ibrahim, Tan Nguyen, Hai Ah Nam, Wahid Bhimji, Steven Farrell, Leonid Oliker, Michael Rowan, Nicholas J. Wright, Samuel Williams:
Architectural Requirements for Deep Learning Workloads in HPC Environments. PMBS 2021: 7-17
- [c62] Tuowen Zhao, Mary W. Hall, Hans Johansen, Samuel Williams:
Improving communication by optimizing on-node data movement with data layout. PPoPP 2021: 304-317
- [c61] Charlene Yang, Yunsong Wang, Thorsten Kurth, Steven Farrell, Samuel Williams:
Hierarchical Roofline Performance Analysis for Deep Learning Applications. SAI (2) 2021: 473-491
- 2020
- [j22] Wenjing Ma, Yulong Ao, Chao Yang, Samuel Williams:
Solving a trillion unknowns per second with HPGMG on Sunway TaihuLight. Clust. Comput. 23(2): 493-507 (2020)
- [j21] Charlene Yang, Thorsten Kurth, Samuel Williams:
Hierarchical Roofline analysis for GPUs: Accelerating performance optimization for the NERSC-9 Perlmutter system. Concurr. Comput. Pract. Exp. 32(20) (2020)
- [c60] Marco Siracusa, Marco Rabozzi, Emanuele Del Sozzo, Lorenzo Di Tucci, Samuel Williams, Marco D. Santambrogio:
A CAD-based methodology to optimize HLS code via the Roofline model. ICCAD 2020: 116:1-116:9
- [c59] Anastasiia Butko, George Michelogiannakis, Samuel Williams, Costin Iancu, David Donofrio, John Shalf, Jonathan Carter, Irfan Siddiqi:
Understanding Quantum Control Processor Capabilities and Limitations through Circuit Characterization. ICRC 2020: 66-75
- [c58] Christopher S. Daley, Hadia Ahmed, Samuel Williams, Nicholas J. Wright:
A Case Study of Porting HPGMG from CUDA to OpenMP Target Offload. IWOMP 2020: 37-51
- [c57] Tan Nguyen, Samuel Williams, Marco Siracusa, Colin MacLean, Douglas Doerfler, Nicholas J. Wright:
The Performance and Energy Efficiency Potential of FPGAs in Scientific Computing. PMBS@SC 2020: 8-19
- [c56] Taylor L. Groves, Ben Brock, Yuxin Chen, Khaled Z. Ibrahim, Lenny Oliker, Nicholas J. Wright, Samuel Williams, Katherine A. Yelick:
Performance Trade-offs in GPU Communication: A Study of Host and Device-initiated Approaches. PMBS@SC 2020: 126-137
- [c55] Nan Ding, Samuel Williams, Yang Liu, Xiaoye S. Li:
Leveraging One-Sided Communication for Sparse Triangular Solvers. PP 2020: 93-105
- [c54] Yunsong Wang, Charlene Yang, Steven Farrell, Yan Zhang, Thorsten Kurth, Samuel Williams:
Time-Based Roofline for Deep Learning Performance Analysis. DLS@SC 2020: 10-19
- [c53] Jonathan R. Madsen, Muaaz G. Awan, Hugo Brunie, Jack Deslippe, Rahulkumar Gayatri, Leonid Oliker, Yunsong Wang, Charlene Yang, Samuel Williams:
Timemory: Modular Performance Analysis for HPC. ISC 2020: 434-452
- [i5] Yunsong Wang, Charlene Yang, Steven Farrell, Yan Zhang, Thorsten Kurth, Samuel Williams:
Time-Based Roofline for Deep Learning Performance Analysis. CoRR abs/2009.04598 (2020)
- [i4] Yunsong Wang, Charlene Yang, Steven Farrell, Thorsten Kurth, Samuel Williams:
Hierarchical Roofline Performance Analysis for Deep Learning Applications. CoRR abs/2009.05257 (2020)
2010 – 2019
- 2019
- [j20] Bei Wang, Stéphane Ethier, William M. Tang, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker:
Modern gyrokinetic particle-in-cell simulation of fusion plasmas on top supercomputers. Int. J. High Perform. Comput. Appl. 33(1) (2019)
- [j19] Weiqun Zhang, Ann S. Almgren, Vincent E. Beckner, John B. Bell, Johannes P. Blaschke, Cy P. Chan, Marcus Day, Brian Friesen, Kevin Gott, Daniel T. Graves, Maximilian Katz, Andrew Myers, Tan Nguyen, Andrew Nonaka, Michele Rosso, Samuel Williams, Michael Zingale:
AMReX: a framework for block-structured adaptive mesh refinement. J. Open Source Softw. 4(37): 1370 (2019)
- [c52] Khaled Z. Ibrahim, Samuel Williams, Leonid Oliker:
Performance Analysis of GPU Programming Models Using the Roofline Scaling Trajectories. Bench 2019: 3-19
- [c51] Nan Ding, Samuel Williams:
An Instruction Roofline Model for GPUs. PMBS@SC 2019: 7-18
- [c50] Tuowen Zhao, Protonu Basu, Samuel Williams, Mary W. Hall, Hans Johansen:
Exploiting reuse and vectorization in blocked stencil computations on CPUs and GPUs. SC 2019: 52:1-52:44
- 2018
- [c49] Khaled Z. Ibrahim, Samuel Williams, Leonid Oliker:
Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis. HPCS 2018: 350-358
- [c48] Tuowen Zhao, Mary W. Hall, Protonu Basu, Samuel Williams, Hans Johansen:
SIMD code generation for stencils on brick decompositions. PPoPP 2018: 423-424
- [c47] Hongzhang Shan, Samuel Williams, Calvin W. Johnson:
Improving MPI Reduction Performance for Manycore Architectures with OpenMP and Data Compression. PMBS@SC 2018: 1-11
- [c46] Tuomas Koskela, Zakhar Matveev, Charlene Yang, Adetokunbo Adedoyin, Roman Belenov, Philippe Thierry, Zhengji Zhao, Rahulkumar Gayatri, Hongzhang Shan, Leonid Oliker, Jack Deslippe, Ron Green, Samuel Williams:
A Novel Multi-level Integrated Roofline Model Approach for Performance Characterization. ISC 2018: 226-245
- 2017
- [j18] Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Williams, Costin Iancu:
Reaching bandwidth saturation using transparent injection parallelization. Int. J. High Perform. Comput. Appl. 31(5): 405-421 (2017)
- [j17] Khaled Z. Ibrahim, Evgeny Epifanovsky, Samuel Williams, Anna I. Krylov:
Cross-scale efficient tensor contractions for coupled cluster computations through multiple programming model backends. J. Parallel Distributed Comput. 106: 92-105 (2017)
- [j16] Protonu Basu, Samuel Williams, Brian van Straalen, Leonid Oliker, Phillip Colella, Mary W. Hall:
Compiler-based code generation and autotuning for geometric multigrid on GPU-accelerated supercomputers. Parallel Comput. 64: 50-64 (2017)
- [j15] Hasan Metin Aktulga, Md. Afibuzzaman, Samuel Williams, Aydin Buluç, Meiyue Shao, Chao Yang, Esmond G. Ng, Pieter Maris, James P. Vary:
A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations. IEEE Trans. Parallel Distributed Syst. 28(6): 1550-1563 (2017)
- [c45] Nathan Zhang, Michael B. Driscoll, Charles Markley, Samuel Williams, Protonu Basu, Armando Fox:
Snowflake: A Lightweight Portable Stencil DSL. IPDPS Workshops 2017: 795-804
- [c44] Bryce Adelstein-Lelbach, Hans Johansen, Samuel Williams:
Simultaneously Solving Swarms of Small Sparse Systems on SIMD Silicon. IPDPS Workshops 2017: 1128-1137
- [c43] Hongzhang Shan, Samuel Williams, Calvin W. Johnson, Kenneth S. McElvain:
A Locality-Based Threading Algorithm for the Configuration-Interaction Method. IPDPS Workshops 2017: 1178-1187
- [c42] Philip C. Roth, Hongzhang Shan, David Riegner, Nikolas Antolin, Sarat Sreepathi, Leonid Oliker, Samuel Williams, Shirley Moore, Wolfgang Windl:
Performance analysis and optimization of the RAMPAGE metal alloy potential generation software. SEPS@SPLASH 2017: 11-20
- [c41] Thorsten Kurth, William Arndt, Taylor Barnes, Brandon Cook, Jack Deslippe, Douglas Doerfler, Brian Friesen, Yun (Helen) He, Tuomas Koskela, Mathieu Lobet, Tareq M. Malas, Leonid Oliker, Andrey Ovsyannikov, Samuel Williams, Woo-Sun Yang, Zhengji Zhao:
Analyzing Performance of Selected NESAP Applications on the Cori HPC System. ISC Workshops 2017: 334-347
- [c40] Brandon Cook, Thorsten Kurth, Brian Austin, Samuel Williams, Jack Deslippe:
Performance Variability on Xeon Phi. ISC Workshops 2017: 419-429
- 2016
- [j14] Pieter Ghysels, Xiaoye S. Li, François-Henry Rouet, Samuel Williams, Artem Napov:
An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling. SIAM J. Sci. Comput. 38(5) (2016)
- [j13] Ariful Azad, Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams:
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication. SIAM J. Sci. Comput. 38(6) (2016)
- [c39] Zhaoyi Meng, Alice Koniges, Yun (Helen) He, Samuel Williams, Thorsten Kurth, Brandon Cook, Jack Deslippe, Andrea L. Bertozzi:
OpenMP Parallelization and Optimization of Graph-Based Machine Learning Algorithms. IWOMP 2016: 17-31
- [c38] Hongzhang Shan, Samuel Williams, Yili Zheng, Weiqun Zhang, Bei Wang, Stéphane Ethier, Zhengji Zhao:
Experiences of Applying One-Sided Communication to Nearest-Neighbor Communication. PAW@SC 2016: 17-24
- [c37] Taylor Barnes, Brandon Cook, Jack Deslippe, Douglas Doerfler, Brian Friesen, Yun (Helen) He, Thorsten Kurth, Tuomas Koskela, Mathieu Lobet, Tareq M. Malas, Leonid Oliker, Andrey Ovsyannikov, Abhinav Sarje, Jean-Luc Vay, Henri Vincenti, Samuel Williams, Pierre Carrier, Nathan Wichmann, Marcus Wagner, Paul R. C. Kent, Christopher Kerr, John M. Dennis:
Evaluating and Optimizing the NERSC Workload on Knights Landing. PMBS@SC 2016: 43-53
- [c36] William M. Tang, Bei Wang, Stéphane Ethier, Grzegorz Kwasniewski, Torsten Hoefler, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker, Carlos Rosales-Fernandez, Timothy J. Williams:
Extreme scale plasma turbulence simulations on top supercomputers worldwide. SC 2016: 502-513
- [c35] Douglas Doerfler, Jack Deslippe, Samuel Williams, Leonid Oliker, Brandon Cook, Thorsten Kurth, Mathieu Lobet, Tareq M. Malas, Jean-Luc Vay, Henri Vincenti:
Applying the Roofline Performance Model to the Intel Xeon Phi Knights Landing Processor. ISC Workshops 2016: 339-353
- 2015
- [j12] Didem Unat, Cy P. Chan, Weiqun Zhang, Samuel Williams, John Bachan, John B. Bell, John Shalf:
ExaSAT: An exascale co-design tool for performance modeling. Int. J. High Perform. Comput. Appl. 29(2): 209-232 (2015)
- [j11] Adam Lugowski, Shoaib Kamil, Aydin Buluç, Samuel Williams, Erika Duriakova, Leonid Oliker, Armando Fox, John R. Gilbert:
Parallel processing of filtered queries in attributed semantic graphs. J. Parallel Distributed Comput. 79-80: 115-131 (2015)
- [c34] Abhinav Sarje, Sukhyun Song, Douglas Jacobsen, Kevin A. Huck, Jeffrey K. Hollingsworth, Allen D. Malony, Samuel Williams, Leonid Oliker:
Parallel Performance Optimizations on Unstructured Mesh-based Simulations. ICCS 2015: 2016-2025
- [c33] Protonu Basu, Mary W. Hall, Samuel Williams, Brian van Straalen, Leonid Oliker, Phillip Colella:
Compiler-Directed Transformation for Higher-Order Stencils. IPDPS 2015: 313-323
- [c32] Alex Druinsky, Pieter Ghysels, Xiaoye S. Li, Osni Marques, Samuel Williams, Andrew T. Barker, Delyan Kalchev, Panayot S. Vassilevski:
Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures. PPAM (1) 2015: 116-127
- [c31] Hongzhang Shan, Samuel Williams, Wibe de Jong, Leonid Oliker:
Thread-level parallelization and optimization of NWChem for the Intel MIC architecture. PMAM@PPoPP 2015: 58-67
- [c30] Nicholas Chaimov, Khaled Z. Ibrahim, Samuel Williams, Costin Iancu:
Exploiting communication concurrency on high performance computing systems. PMAM@PPoPP 2015: 132-143
- [c29] Hongzhang Shan, Samuel Williams, Calvin W. Johnson, Kenneth S. McElvain, W. Erich Ormand:
Parallel implementation and performance optimization of the configuration-interaction method. SC 2015: 9:1-9:12
- [i3] Pieter Ghysels, Xiaoye S. Li, François-Henry Rouet, Samuel Williams, Artem Napov:
An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling. CoRR abs/1502.07405 (2015)
- [i2] Ariful Azad, Grey Ballard, Aydin Buluç, James Demmel, Laura Grigori, Oded Schwartz, Sivan Toledo, Samuel Williams:
Exploiting Multiple Levels of Parallelism in Sparse Matrix-Matrix Multiplication. CoRR abs/1510.00844 (2015)
- [i1] Bei Wang, Stéphane Ethier, William M. Tang, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker:
Modern Gyrokinetic Particle-In-Cell Simulation of Fusion Plasmas on Top Supercomputers. CoRR abs/1510.05546 (2015)
- 2014
- [c28] Khaled Z. Ibrahim, Samuel W. Williams, Evgeny Epifanovsky, Anna I. Krylov:
Analysis and tuning of libtensor framework on multicore architectures. HiPC 2014: 1-10
- [c27] George Michelogiannakis, Alexander Williams, Samuel Williams, John Shalf:
Collective memory transfers for multi-core chips. ICS 2014: 343-352
- [c26] Samuel Williams, Mike Lijewski, Ann S. Almgren, Brian van Straalen, Erin C. Carson, Nicholas Knight, James Demmel:
s-Step Krylov Subspace Methods as Bottom Solvers for Geometric Multigrid. IPDPS 2014: 1149-1158
- [c25] Hasan Metin Aktulga, Aydin Buluç, Samuel Williams, Chao Yang:
Optimizing Sparse Matrix-Multiple Vectors Multiplication for Nuclear Configuration Interaction Calculations. IPDPS 2014: 1213-1222
- [c24] Hongzhang Shan, Amir Kamil, Samuel Williams, Yili Zheng, Katherine A. Yelick:
Evaluation of PGAS Communication Paradigms with Geometric Multigrid. PGAS 2014: 8:1-8:12
- [c23] Yu Jung Lo, Samuel Williams, Brian van Straalen, Terry J. Ligocki, Matthew J. Cordery, Nicholas J. Wright, Mary W. Hall, Leonid Oliker:
Roofline Model Toolkit: A Practical Tool for Architectural and Program Analysis. PMBS@SC 2014: 129-148
- 2013
- [j10] Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Bei Wang, Stéphane Ethier, Leonid Oliker:
Analysis and optimization of gyrokinetic toroidal simulations on homogenous and heterogenous platforms. Int. J. High Perform. Comput. Appl. 27(4): 454-473 (2013)
- [c22] Protonu Basu, Anand Venkat, Mary W. Hall, Samuel Williams, Brian van Straalen, Leonid Oliker:
Compiler generation and autotuning of communication-avoiding operators for geometric multigrid. HiPC 2013: 452-461
- [c21] Aydin Buluç, Erika Duriakova, Armando Fox, John R. Gilbert, Shoaib Kamil, Adam Lugowski, Leonid Oliker, Samuel Williams:
High-Productivity and High-Performance Analysis of Filtered Semantic Graphs. IPDPS 2013: 237-248
- [c20] Christopher D. Krieger, Michelle Mills Strout, Catherine Olschanowsky, Andrew Stone, Stephen M. Guzik, Xinfeng Gao, Carlo Bertolli, Paul H. J. Kelly, Gihan R. Mudalige, Brian van Straalen, Samuel Williams:
Loop Chaining: A Programming Abstraction for Balancing Locality and Parallelism. IPDPS Workshops 2013: 375-384
- [c19] Bei Wang, Stéphane Ethier, William M. Tang, Timothy J. Williams, Khaled Z. Ibrahim, Kamesh Madduri, Samuel Williams, Leonid Oliker:
Kinetic turbulence simulations at extreme scale on leadership-class systems. SC 2013: 82:1-82:12
- 2012
- [j9] Kamesh Madduri, Jimmy Su, Samuel Williams, Leonid Oliker, Stéphane Ethier, Katherine A. Yelick:
Optimization of Parallel Particle-to-Grid Interpolation on Leading Multicore Platforms. IEEE Trans. Parallel Distributed Syst. 23(10): 1915-1922 (2012)
- [c18] Aydin Buluç, Armando Fox, John R. Gilbert, Shoaib Kamil, Adam Lugowski, Leonid Oliker, Samuel Williams:
High-performance analysis of filtered semantic graphs. PACT 2012: 463-464
- [c17] Samuel Williams, Dhiraj D. Kalamkar, Amik Singh, Anand M. Deshpande, Brian van Straalen, Mikhail Smelyanskiy, Ann S. Almgren, Pradeep Dubey, John Shalf, Leonid Oliker:
Optimization of geometric multigrid for emerging multi- and manycore processors. SC 2012: 96
- [c16] Bei Wang, Stéphane Ethier, William M. Tang, Khaled Z. Ibrahim, Kamesh Madduri, Samuel W. Williams, Leonid Oliker, Timothy J. Williams:
Abstract: Advances in Gyrokinetic Particle in Cell Simulation for Fusion Plasmas to Extreme Scale. SC Companion 2012: 1439-1440
- [c15] Bei Wang, Stéphane Ethier, William M. Tang, Khaled Z. Ibrahim, Kamesh Madduri, Samuel W. Williams, Leonid Oliker, Timothy J. Williams:
Poster: Advances in Gyrokinetic Particle in Cell Simulation for Fusion Plasmas to Extreme Scale. SC Companion 2012: 1441
- 2011
- [j8] Kamesh Madduri, Eun-Jin Im, Khaled Z. Ibrahim, Samuel Williams, Stéphane Ethier, Leonid Oliker:
Gyrokinetic particle-in-cell optimization on emerging multi- and manycore platforms. Parallel Comput. 37(9): 501-520 (2011)
- [c14] Aydin Buluç, Samuel Williams, Leonid Oliker, James Demmel:
Reduced-Bandwidth Multithreaded Algorithms for Sparse Matrix-Vector Multiplication. IPDPS 2011: 721-733
- [c13] Kamesh Madduri, Khaled Z. Ibrahim, Samuel Williams, Eun-Jin Im, Stéphane Ethier, John Shalf, Leonid Oliker:
Gyrokinetic toroidal simulations on leading multi- and manycore HPC systems. SC 2011: 23:1-23:12
- [c12] Samuel Williams, Leonid Oliker, Jonathan Carter, John Shalf:
Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning. SC 2011: 55:1-55:12
- [c11] Jens Krueger, David Donofrio, John Shalf, Marghoob Mohiyuddin, Samuel Williams, Leonid Oliker, Franz-Josef Pfreundt:
Hardware/software co-design for energy-efficient seismic modeling. SC 2011: 73:1-73:12
- 2010
- [c10] Aparna Chandramowlishwaran, Samuel Williams, Leonid Oliker, Ilya Lashuk, George Biros, Richard W. Vuduc:
Optimizing and tuning the fast multipole method for state-of-the-art multicore architectures. IPDPS 2010: 1-12
- [c9] Shoaib Kamil, Cy P. Chan, Leonid Oliker, John Shalf, Samuel Williams:
An auto-tuning framework for parallel multicore stencil computations. IPDPS 2010: 1-12
- [p2] Samuel Williams, Nathan Bell, Jee Whan Choi, Michael Garland, Leonid Oliker, Richard Vu:
Sparse Matrix-Vector Multiplication on Multicore and Accelerators. Scientific Computing with Multicore and Accelerators 2010: 83-109
- [p1] Kaushik Datta, Samuel Williams, Vasily Volkov, Jonathan Carter, Leonid Oliker, John Shalf, Katherine A. Yelick:
Auto-Tuning Stencil Computations on Multicore and Accelerators. Scientific Computing with Multicore and Accelerators 2010: 219-253
2000 – 2009
- 2009
- [j7] Samuel Williams, Andrew Waterman, David A. Patterson:
Roofline: an insightful visual performance model for multicore architectures. Commun. ACM 52(4): 65-76 (2009)
- [j6] Shujia Zhou, Daniel C. Duffy, Tom Clune, Max Suarez, Samuel Williams, Milton Halem:
The impact of IBM Cell technology on the programming paradigm in the context of computer systems for climate and weather models. Concurr. Comput. Pract. Exp. 21(17): 2176-2186 (2009)
- [j5] Samuel Williams, Jonathan Carter, Leonid Oliker, John Shalf, Katherine A. Yelick:
Optimization of a lattice Boltzmann computation on state-of-the-art multicore platforms. J. Parallel Distributed Comput. 69(9): 762-777 (2009)
- [j4] Samuel Williams, Leonid Oliker, Richard W. Vuduc, John Shalf, Katherine A. Yelick, James Demmel:
Optimization of sparse matrix-vector multiplication on emerging multicore platforms. Parallel Comput. 35(3): 178-194 (2009)
- [j3] Kaushik Datta, Shoaib Kamil, Samuel Williams, Leonid Oliker, John Shalf, Katherine A. Yelick:
Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors. SIAM Rev. 51(1): 129-159 (2009)
- [c8] Joseph Gebis, Leonid Oliker, John Shalf, Samuel Williams, Katherine A. Yelick:
Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture. ARCS 2009: 146-158
- [c7] Kamesh Madduri, Samuel Williams, Stéphane Ethier, Leonid Oliker, John Shalf, Erich Strohmaier, Katherine A. Yelick:
Memory-efficient optimization of Gyrokinetic particle-to-grid interpolation for multicore processors. SC 2009
- [c6] Marghoob Mohiyuddin, Mark Murphy, Leonid Oliker, John Shalf, John Wawrzynek, Samuel Williams:
A design methodology for domain-optimized power-efficient supercomputing. SC 2009
- 2008
- [c5] Samuel Williams, Jonathan Carter, Leonid Oliker, John Shalf, Katherine A. Yelick:
Lattice Boltzmann simulation optimization on leading multicore platforms. IPDPS 2008: 1-14
- [c4] Kaushik Datta, Mark Murphy, Vasily Volkov, Samuel Williams, Jonathan Carter, Leonid Oliker, David A. Patterson, John Shalf, Katherine A. Yelick:
Stencil computation optimization and auto-tuning on state-of-the-art multicore architectures. SC 2008: 4
- 2007
- [j2] Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine A. Yelick:
Scientific Computing Kernels on the Cell Processor. Int. J. Parallel Program. 35(3): 263-298 (2007)
- [c3] Samuel Williams, Leonid Oliker, Richard W. Vuduc, John Shalf, Katherine A. Yelick, James Demmel:
Optimization of sparse matrix-vector multiplication on emerging multicore platforms. SC 2007: 38
- 2006
- [c2] Shoaib Kamil, Kaushik Datta, Samuel Williams, Leonid Oliker, John Shalf, Katherine A. Yelick:
Implicit and explicit optimizations for stencil computations. Memory System Performance and Correctness 2006: 51-60
- [c1] Samuel Williams, John Shalf, Leonid Oliker, Shoaib Kamil, Parry Husbands, Katherine A. Yelick:
The potential of the cell processor for scientific computing. Conf. Computing Frontiers 2006: 9-20
- 2001
- [j1] Christoforos E. Kozyrakis, David Judd, Joseph Gebis, Samuel Williams, David A. Patterson, Katherine A. Yelick:
Hardware/compiler codevelopment for an embedded media processor. Proc. IEEE 89(11): 1694-1709 (2001)
last updated on 2024-10-07 21:25 CEST by the dblp team
all metadata released as open data under CC0 1.0 license