Torsten Hoefler
Torsten Höfler
Person information

- affiliation: ETH Zürich
2020 – today
- 2023
- [j55]Torsten Hoefler, Thomas Häner, Matthias Troyer:
Disentangling Hype from Practicality: On Realistically Achieving Quantum Advantage. Commun. ACM 66(5): 82-87 (2023) - [j54]Maciej Besta, Marc Fischer, Vasiliki Kalavri, Michael Kapralov, Torsten Hoefler:
Practice of Streaming Processing of Dynamic Graphs: Concepts, Models, and Systems. IEEE Trans. Parallel Distributed Syst. 34(6): 1860-1876 (2023) - [c238]Tal Ben-Nun, Berke Ates, Alexandru Calotoiu, Torsten Hoefler:
Bridging Control-Centric and Data-Centric Optimization. CGO 2023: 173-185 - [c237]Tal Ben-Nun, Lukas Gianinazzi, Torsten Hoefler, Yishai Oltchik:
Maximum Flows in Parametric Graph Templates. CIAC 2023: 97-111 - [i135]Niels Gleinig, Tal Ben-Nun, Torsten Hoefler:
A Theory of I/O-Efficient Sparse Neural Network Inference. CoRR abs/2301.01048 (2023) - [i134]Satoshi Matsuoka, Jens Domke, Mohamed Wahib, Aleksandr Drozd, Torsten Hoefler:
Myths and Legends in High-Performance Computing. CoRR abs/2301.02432 (2023) - [i133]Jinfan Chen, Shigang Li, Ran Guo, Jinhui Yuan, Torsten Hoefler:
AutoDDL: Automatic Distributed Deep Learning with Asymptotically Optimal Communication. CoRR abs/2301.06813 (2023) - [i132]Niels Gleinig, Tobias Rohner, Torsten Hoefler:
Approximate Reversible Circuits for NISQ-Era Quantum Computers. CoRR abs/2302.01066 (2023) - [i131]Torsten Hoefler, Duncan Roweth, Keith D. Underwood, Bob Alverson, Mark Griswold, Vahid Tabatabaee, Mohan Kalkunte, Surendra Anubolu, Siyan Shen, Abdul Kabbani, Moray McLaren, Steve Scott:
Datacenter Ethernet and RDMA: Issues at Hyperscale. CoRR abs/2302.03337 (2023) - [i130]Kartik Lakhotia, Laura Monroe, Kelly Isham, Maciej Besta, Nils Blach, Torsten Hoefler, Fabrizio Petrini:
PolarStar: Expanding the Scalability Horizon of Diameter-3 Networks. CoRR abs/2302.07217 (2023) - [i129]Lukas Trümper, Tal Ben-Nun, Philipp Schaad, Alexandru Calotoiu, Torsten Hoefler:
Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization. CoRR abs/2303.08142 (2023) - [i128]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Saleh Ashkboos, Torsten Hoefler:
STen: Productive and Efficient Sparsity in PyTorch. CoRR abs/2304.07613 (2023) - [i127]Kazuki Osawa, Satoki Ishikawa, Rio Yokota, Shigang Li, Torsten Hoefler:
ASDL: A Unified Interface for Gradient Preconditioning in PyTorch. CoRR abs/2305.04684 (2023) - [i126]Thomas Benz, Michael Rogenmoser, Paul Scheffler, Samuel Riedel, Alessandro Ottaviano, Andreas Kurth, Torsten Hoefler, Luca Benini:
A High-performance, Energy-efficient Modular DMA Engine Architecture. CoRR abs/2305.05240 (2023) - [i125]Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra. CoRR abs/2305.05559 (2023) - [i124]Marcin Copik, Roman Böhringer, Alexandru Calotoiu, Torsten Hoefler:
FMI: Fast and Cheap Message Passing for Serverless Functions. CoRR abs/2305.08763 (2023) - [i123]Maciej Besta, Robert Gerstenberger, Marc Fischer, Michal Podstawski, Jürgen Müller, Nils Blach, Berke Egeli, George Mitenkov, Wojciech Chlapek, Marek Michalewicz, Torsten Hoefler:
High-Performance Graph Databases That Are Portable, Programmable, and Scale to Hundreds of Thousands of Cores. CoRR abs/2305.11162 (2023)
- 2022
- [j53]Torsten Hoefler, Ariel Hendel, Duncan Roweth:
The Convergence of Hyperscale Data Center and High-Performance Computing Networks. Computer 55(7): 29-37 (2022) - [j52]Torsten Hoefler:
Benchmarking Data Science: 12 Ways to Lie With Statistics and Performance on Parallel Computers. Computer 55(8): 49-56 (2022) - [j51]Marcin Copik, Tobias Grosser, Torsten Hoefler, Paolo Bientinesi, Benjamin Berkels:
Work-Stealing Prefix Scan: Addressing Load Imbalance in Large-Scale Image Registration. IEEE Trans. Parallel Distributed Syst. 33(3): 523-535 (2022) - [c236]Konstantin Taranov, Benjamin Rothenberger, Daniele De Sensi, Adrian Perrig, Torsten Hoefler:
NeVerMore: Exploiting RDMA Mistakes in NVMe-oF Storage Applications. CCS 2022: 2765-2778 - [c235]Andrea Cossettini, Konstantin Taranov, Christian Vogt, Michele Magno, Torsten Hoefler, Luca Benini:
A RDMA Interface for Ultra-Fast Ultrasound Data-Streaming over an Optical Link. DATE 2022: 80-83 - [c234]Niels Gleinig, Torsten Hoefler:
Circuits for Measurement Based Quantum State Preparation. DATE 2022: 328-333 - [c233]Andrea Biagioni, Paolo Cretaro, Ottorino Frezza, Francesca Lo Cicero, Alessandro Lonardo, Michele Martinelli, Pier Stanislao Paolucci, Elena Pastorelli, Francesco Simula, Matteo Turisini, Piero Vicini, Roberto Ammendola, Pascale Bernier-Bruna, Claire Chen, Said Derradji, Stéphane Guez, Pierre-Axel Lagadec, Gregoire Pichon, Etienne Walter, Gaetan De Gassowski, Matthieu Hautreaux, Stephane Mathieu, Gilles Moreau, Marc Pérache, Hugo Taboada, Torsten Hoefler, Timo Schneider, Matteo Barnaba, Giuseppe Piero Brandino, Francesco De Giorgi, Matteo Poggi, Iakovos Mavroidis, Yannis Papaefstathiou, Nikolaos Tampouratzis, Benjamin Kalisch, Ulrich Krackhardt, Mondrian Nuessle, Pantelis Xirouchakis, Vangelis Mageiropoulos, Michalis Gianioudis, Harisis Loukas, Aggelos Ioannou, Nikos Kallimanis, Nikos Chrysos, Manolis Katevenis, Wolfang Frings, Dominik Gottwald, Felime Guimaraes, Max Holicki, Volker Marx, Yannik Muller, Carsten Clauss, Hugo Falter, Xu Huang, Jennifer Lopez Barillao, Thomas Moschny, Simon Pickartz, Francisco J. Alfaro, Jesús Escudero-Sahuquillo, Pedro Javier García, Francisco J. Quiles, José L. Sánchez, Adrián Castelló, Jose Duro, María Engracia Gómez, Enrique S. Quintana-Ortí, Julio Sahuquillo, Eugenio Stabile:
RED-SEA: Network Solution for Exascale Architectures. DSD 2022: 712-719 - [c232]Shiyi Cao, Salvatore Di Girolamo, Torsten Hoefler:
Accelerating Data Serialization/Deserialization Protocols with In-Network Compute. ExaMPI@SC 2022: 22-30 - [c231]Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler:
Fast Arbitrary Precision Floating Point on FPGA. FCCM 2022: 1-9 - [c230]Carl-Johannes Johnsen, Tiziano De Matteis, Tal Ben-Nun, Johannes de Fine Licht, Torsten Hoefler:
Temporal Vectorization: A Compiler Approach to Automatic Multi-Pumping. ICCAD 2022: 85 - [c229]Bryan A. Plummer, Nikoli Dryden, Julius Frost, Torsten Hoefler, Kate Saenko:
Neural Parameter Allocation Search. ICLR 2022 - [c228]Larissa Schmid, Marcin Copik, Alexandru Calotoiu, Dominik Werle, Andreas Reiter, Michael Selzer, Anne Koziolek, Torsten Hoefler:
Performance-detective: automatic deduction of cheap and accurate performance models. ICS 2022: 3:1-3:13 - [c227]Alexandru Calotoiu, Tal Ben-Nun, Grzegorz Kwasniewski, Johannes de Fine Licht, Timo Schneider, Philipp Schaad, Torsten Hoefler:
Lifting C semantics for dataflow optimization. ICS 2022: 17:1-17:13 - [c226]Oliver Rausch, Tal Ben-Nun, Nikoli Dryden, Andrei Ivanov, Shigang Li, Torsten Hoefler:
A data-centric optimization framework for machine learning. ICS 2022: 36:1-36:13 - [c225]Andrei Lascu, Alastair F. Donaldson, Tobias Grosser, Torsten Hoefler:
Metamorphic Fuzzing of C++ Libraries. ICST 2022: 35-46 - [c224]Niels Gleinig, Maciej Besta, Torsten Hoefler:
I/O-Optimal Cache-Oblivious Sparse Matrix-Sparse Matrix Multiplication. IPDPS 2022: 36-46 - [c223]András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler:
Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching. IPDPS 2022: 291-301 - [c222]Maciej Besta, Raphael Grob, Cesare Miglioli, Nicola Bernold, Grzegorz Kwasniewski, Gabriel Gjini, Raghavendra Kanakagiri, Saleh Ashkboos, Lukas Gianinazzi, Nikoli Dryden, Torsten Hoefler:
Motif Prediction with Graph Neural Networks. KDD 2022: 35-45 - [c221]Maciej Besta, Patrick Iff, Florian Scheidl, Kazuki Osawa, Nikoli Dryden, Michal Podstawski, Tiancheng Chen, Torsten Hoefler:
Neural Graph Databases. LoG 2022: 31 - [c220]Saleh Ashkboos, Langwen Huang, Nikoli Dryden, Tal Ben-Nun, Peter Dueben, Lukas Gianinazzi, Luca Kummer, Torsten Hoefler:
ENS-10: A Dataset For Post-Processing Ensemble Weather Forecasts. NeurIPS 2022 - [c219]Nikoli Dryden, Torsten Hoefler:
Spatial Mixture-of-Experts. NeurIPS 2022 - [c218]Shigang Li, Torsten Hoefler:
Near-optimal sparse allreduce for distributed deep learning. PPoPP 2022: 135-149 - [c217]Salvatore Di Girolamo, Daniele De Sensi, Konstantin Taranov, Milos Malesevic, Maciej Besta, Timo Schneider, Severin Kistler, Torsten Hoefler:
Building Blocks for Network-Accelerated Distributed File Systems. SC 2022: 10:1-10:14 - [c216]Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott:
HammingMesh: A Network Topology for Large-Scale Deep Learning. SC 2022: 11:1-11:18 - [c215]Kartik Lakhotia, Maciej Besta, Laura Monroe, Kelly Isham, Patrick Iff, Torsten Hoefler, Fabrizio Petrini:
PolarFly: A Cost-Effective and Flexible Low-Diameter Topology. SC 2022: 12:1-12:15 - [c214]Alexandros Nikolaos Ziogas, Grzegorz Kwasniewski, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
Deinsum: Practically I/O Optimal Multi-Linear Algebra. SC 2022: 25:1-25:15 - [c213]Shigang Li, Kazuki Osawa, Torsten Hoefler:
Efficient Quantized Sparse Matrix Operations on Tensor Cores. SC 2022: 37:1-37:15 - [c212]Maciej Besta, Cesare Miglioli, Paolo Sylos Labini, Jakub Tetek, Patrick Iff, Raghavendra Kanakagiri, Saleh Ashkboos, Kacper Janda, Michal Podstawski, Grzegorz Kwasniewski, Niels Gleinig, Flavio Vella, Onur Mutlu, Torsten Hoefler:
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. SC 2022: 43:1-43:17 - [c211]Philipp Schaad, Tal Ben-Nun, Torsten Hoefler:
Boosting Performance Optimization with Interactive Data Movement Visualization. SC 2022: 64:1-64:16 - [c210]Tal Ben-Nun, Linus Groner, Florian Deconinck, Tobias Wicky, Eddie Davis, Johann Dahm, Oliver Elbert, Rhea George, Jeremy McGibbon, Lukas Trümper, Elynn Wu, Oliver Fuhrer, Thomas C. Schulthess, Torsten Hoefler:
Productive Performance Engineering for Weather and Climate Modeling with Python. SC 2022: 73:1-73:14 - [c209]Konstantin Taranov, Steve Byan, Virendra J. Marathe, Torsten Hoefler:
KafkaDirect: Zero-copy Data Access for Apache Kafka over RDMA Networks. SIGMOD Conference 2022: 2191-2204 - [c208]Niels Gleinig, Torsten Hoefler:
The Red-Blue Pebble Game on Trees and DAGs with Large Input. SIROCCO 2022: 135-153 - [i122]Shigang Li, Torsten Hoefler:
Near-Optimal Sparse Allreduce for Distributed Deep Learning. CoRR abs/2201.07598 (2022) - [i121]Konstantin Taranov, Benjamin Rothenberger, Daniele De Sensi, Adrian Perrig, Torsten Hoefler:
NeVerMore: Exploiting RDMA Mistakes in NVMe-oF Storage Applications. CoRR abs/2202.08080 (2022) - [i120]András Strausz, Flavio Vella, Salvatore Di Girolamo, Maciej Besta, Torsten Hoefler:
Asynchronous Distributed-Memory Triangle Counting and LCC with RMA Caching. CoRR abs/2202.13976 (2022) - [i119]Marcin Copik, Alexandru Calotoiu, Konstantin Taranov, Torsten Hoefler:
FaasKeeper: a Blueprint for Serverless Services. CoRR abs/2203.14859 (2022) - [i118]Johannes de Fine Licht, Christopher A. Pattison, Alexandros Nikolaos Ziogas, David Simmons-Duffin, Torsten Hoefler:
Fast Arbitrary Precision Floating Point on FPGA. CoRR abs/2204.06256 (2022) - [i117]Tal Ben-Nun, Linus Groner, Florian Deconinck, Tobias Wicky, Eddie Davis, Johann Dahm, Oliver Elbert, Rhea George, Jeremy McGibbon, Lukas Trümper, Elynn Wu, Oliver Fuhrer, Thomas C. Schulthess, Torsten Hoefler:
Productive Performance Engineering for Weather and Climate Modeling with Python. CoRR abs/2205.04148 (2022) - [i116]Lukas Gianinazzi, Tal Ben-Nun, Saleh Ashkboos, Yves Baumann, Piotr Luczynski, Torsten Hoefler:
The spatial computer: A model for energy-efficient parallel computation. CoRR abs/2205.04934 (2022) - [i115]Maciej Besta, Torsten Hoefler:
Parallel and Distributed Graph Neural Networks: An In-Depth Concurrency Analysis. CoRR abs/2205.09702 (2022) - [i114]Alexandros Nikolaos Ziogas, Grzegorz Kwasniewski, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
Deinsum: Practically I/O Optimal Multilinear Algebra. CoRR abs/2206.08301 (2022) - [i113]Salvatore Di Girolamo, Daniele De Sensi, Konstantin Taranov, Milos Malesevic, Maciej Besta, Timo Schneider, Severin Kistler, Torsten Hoefler:
Building Blocks for Network-Accelerated Distributed File Systems. CoRR abs/2206.10007 (2022) - [i112]Saleh Ashkboos, Langwen Huang, Nikoli Dryden, Tal Ben-Nun, Peter Dueben, Lukas Gianinazzi, Luca Kummer, Torsten Hoefler:
ENS-10: A Dataset For Post-Processing Ensemble Weather Forecast. CoRR abs/2206.14786 (2022) - [i111]Philipp Schaad, Tal Ben-Nun, Torsten Hoefler:
Boosting Performance Optimization with Interactive Data Movement Visualization. CoRR abs/2207.07433 (2022) - [i110]Kartik Lakhotia, Maciej Besta, Laura Monroe, Kelly Isham, Patrick Iff, Torsten Hoefler, Fabrizio Petrini:
PolarFly: A Cost-Effective and Flexible Low-Diameter Topology. CoRR abs/2208.01695 (2022) - [i109]Maciej Besta, Cesare Miglioli, Paolo Sylos Labini, Jakub Tetek, Patrick Iff, Raghavendra Kanakagiri, Saleh Ashkboos, Kacper Janda, Michal Podstawski, Grzegorz Kwasniewski, Niels Gleinig, Flavio Vella, Onur Mutlu, Torsten Hoefler:
ProbGraph: High-Performance and High-Accuracy Graph Mining with Probabilistic Set Representations. CoRR abs/2208.11469 (2022) - [i108]Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott:
HammingMesh: A Network Topology for Large-Scale Deep Learning. CoRR abs/2209.01346 (2022) - [i107]Andrei Ivanov, Benjamin Rothenberger, Arnaud Dethise, Marco Canini, Torsten Hoefler, Adrian Perrig:
SAGE: Software-based Attestation for GPU Execution. CoRR abs/2209.03125 (2022) - [i106]Shigang Li, Kazuki Osawa, Torsten Hoefler:
Efficient Quantized Sparse Matrix Operations on Tensor Cores. CoRR abs/2209.06979 (2022) - [i105]Maciej Besta, Patrick Iff, Florian Scheidl, Kazuki Osawa, Nikoli Dryden, Michal Podstawski, Tiancheng Chen, Torsten Hoefler:
Neural Graph Databases. CoRR abs/2209.09732 (2022) - [i104]Carl-Johannes Johnsen, Tiziano De Matteis, Tal Ben-Nun, Johannes de Fine Licht, Torsten Hoefler:
Temporal Vectorization: A Compiler Approach to Automatic Multi-Pumping. CoRR abs/2210.04598 (2022) - [i103]Langwen Huang, Torsten Hoefler:
Compressing multidimensional weather and climate data into neural networks. CoRR abs/2210.12538 (2022) - [i102]Daniele De Sensi, Tiziano De Matteis, Konstantin Taranov, Salvatore Di Girolamo, Tobias Rahn, Torsten Hoefler:
Noise in the Clouds: Influence of Network Performance Variability on Application Scalability. CoRR abs/2210.15315 (2022) - [i101]Elias Frantar, Saleh Ashkboos, Torsten Hoefler, Dan Alistarh:
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transformers. CoRR abs/2210.17323 (2022) - [i100]Michael E. Beverland, Prakash Murali, Matthias Troyer, Krysta M. Svore, Torsten Hoefler, Vadym Kliuchnikov, Guang Hao Low, Mathias Soeken, Aarthi Sundaram, Alexander Vaschillo:
Assessing requirements to scale to practical quantum advantage. CoRR abs/2211.07629 (2022) - [i99]Nikoli Dryden, Torsten Hoefler:
Spatial Mixture-of-Experts. CoRR abs/2211.13491 (2022) - [i98]Patrick Iff, Maciej Besta, Matheus A. Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler:
Sparse Hamming Graph: A Customizable Network-on-Chip Topology. CoRR abs/2211.13980 (2022) - [i97]Patrick Iff, Maciej Besta, Matheus A. Cavalcante, Tim Fischer, Luca Benini, Torsten Hoefler:
HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement. CoRR abs/2211.13989 (2022) - [i96]Kazuki Osawa, Shigang Li, Torsten Hoefler:
PipeFisher: Efficient Training of Large Language Models Using Pipelining and Fisher Information Matrices. CoRR abs/2211.14133 (2022) - [i95]Konstantin Taranov, Fabian Fischer, Torsten Hoefler:
Efficient RDMA Communication Protocols. CoRR abs/2212.09134 (2022) - [i94]Johannes de Fine Licht, Tiziano De Matteis, Tal Ben-Nun, Andreas Kuster, Oliver Rausch, Manuel Burger, Carl-Johannes Johnsen, Torsten Hoefler:
Python FPGA Programming with Data-Centric Multi-Level Design. CoRR abs/2212.13768 (2022)
- 2021
- [j50]Torsten Hoefler, Dan Alistarh, Tal Ben-Nun, Nikoli Dryden, Alexandra Peste:
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks. J. Mach. Learn. Res. 22: 241:1-241:124 (2021) - [j49]Arjun Pitchanathan, Christian Ulmann, Michel Weber, Torsten Hoefler, Tobias Grosser:
FPL: fast Presburger arithmetic through transprecision. Proc. ACM Program. Lang. 5(OOPSLA): 1-26 (2021) - [j48]Daniele De Sensi, Tiziano De Matteis, Konstantin Taranov, Salvatore Di Girolamo, Tobias Rahn, Torsten Hoefler:
Noise in the Clouds: Influence of Network Performance Variability on Application Scalability. Proc. ACM Meas. Anal. Comput. Syst. 6(3): 49:1-49:27 (2021) - [j47]Maciej Besta, Zur Vonarburg-Shmaria, Yannick Schaffner, Leonardo Schwarz, Grzegorz Kwasniewski, Lukas Gianinazzi, Jakub Beránek, Kacper Janda, Tobias Holenstein, Sebastian Leisinger, Peter Tatkowski, Esref Özdemir, Adrian Balla, Marcin Copik, Philipp Lindenberger, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra. Proc. VLDB Endow. 14(11): 1922-1936 (2021) - [j46]Edgar Solomonik, James Demmel, Torsten Hoefler:
Communication Lower Bounds of Bilinear Algorithms for Symmetric Tensor Contractions. SIAM J. Sci. Comput. 43(5): A3328-A3356 (2021) - [j45]Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation. ACM Trans. Archit. Code Optim. 18(4): 51:1-51:23 (2021) - [j44]Fabian Schuiki, Florian Zaruba, Torsten Hoefler, Luca Benini:
Stream Semantic Registers: A Lightweight RISC-V ISA Extension Achieving Full Compute Utilization in Single-Issue Cores. IEEE Trans. Computers 70(2): 212-227 (2021) - [j43]Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Snitch: A Tiny Pseudo Dual-Issue Processor for Area and Energy Efficient Execution of Floating-Point Intensive Workloads. IEEE Trans. Computers 70(11): 1845-1860 (2021) - [j42]Maciej Besta, Jens Domke, Marcel Schneider, Marek Konieczny, Salvatore Di Girolamo, Timo Schneider, Ankit Singla, Torsten Hoefler:
High-Performance Routing With Multipathing and Path Diversity in Ethernet and HPC Networks. IEEE Trans. Parallel Distributed Syst. 32(4): 943-959 (2021) - [j41]Johannes de Fine Licht, Maciej Besta, Simon Meierhans, Torsten Hoefler:
Transformations of High-Level Synthesis Codes for High-Performance Computing. IEEE Trans. Parallel Distributed Syst. 32(5): 1014-1029 (2021) - [j40]Shigang Li, Tal Ben-Nun, Giorgi Nadiradze, Salvatore Di Girolamo, Nikoli Dryden, Dan Alistarh, Torsten Hoefler:
Breaking (Global) Barriers in Parallel Stochastic Optimization With Wait-Avoiding Group Averaging. IEEE Trans. Parallel Distributed Syst. 32(7): 1725-1739 (2021) - [c207]Dan Graur, Rodrigo Bruno, Joschka Bischoff, Marcel Rieser, Wolfgang Scherr, Torsten Hoefler, Gustavo Alonso:
Hermes: Enabling efficient large-scale simulation in MATSim. ANT/EDI40 2021: 635-641 - [c206]Johannes de Fine Licht, Andreas Kuster, Tiziano De Matteis, Tal Ben-Nun, Dominic Hofer, Torsten Hoefler:
StencilFlow: Mapping Large Stencil Programs to Distributed Spatial Computing Systems. CGO 2021: 315-326 - [c205]Niels Gleinig, Torsten Hoefler:
An Efficient Algorithm for Sparse Quantum State Preparation. DAC 2021: 433-438 - [c204]Paul Scheffler, Florian Zaruba, Fabian Schuiki, Torsten Hoefler, Luca Benini:
Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra. DATE 2021: 1787-1792 - [c203]Chris Cummins, Zacharias V. Fisches, Tal Ben-Nun, Torsten Hoefler, Michael F. P. O'Boyle, Hugh Leather:
ProGraML: A Graph-based Program Representation for Data Flow Analysis and Compiler Optimizations. ICML 2021: 2244-2253 - [c202]Alexandros Nikolaos Ziogas, Tal Ben-Nun, Timo Schneider, Torsten Hoefler:
NPBench: a benchmarking suite for high-performance NumPy. ICS 2021: 63-74 - [c201]Marcus Ritter, Alexander Geiß, Johannes Wehrstein, Alexandru Calotoiu, Thorsten Reimann, Torsten Hoefler, Felix Wolf:
Noise-Resilient Empirical Performance Modeling with Deep Neural Networks. IPDPS 2021: 23-34 - [c200]Salvatore Di Girolamo, Andreas Kurth, Alexandru Calotoiu, Thomas Benz, Timo Schneider, Jakub Beránek, Luca Benini, Torsten Hoefler:
A RISC-V in-network accelerator for flexible high-performance low-power packet processing. ISCA 2021: 958-971 - [c199]Maciej Besta, Raghavendra Kanakagiri, Grzegorz Kwasniewski, Rachata Ausavarungnirun, Jakub Beránek, Konstantinos Kanellopoulos, Kacper Janda, Zur Vonarburg-Shmaria, Lukas Gianinazzi, Ioana Stefan, Juan Gómez-Luna, Jakub Golinowski, Marcin Copik, Lukas Kapp-Schwoerer, Salvatore Di Girolamo, Nils Blach, Marek Konieczny, Onur Mutlu, Torsten Hoefler:
SISA: Set-Centric Instruction Set Architecture for Graph Mining on Processing-in-Memory Systems. MICRO 2021: 282-297 - [c198]Marcin Copik, Grzegorz Kwasniewski, Maciej Besta, Michal Podstawski, Torsten Hoefler:
SeBS: a serverless benchmark suite for function-as-a-service computing. Middleware 2021: 64-78 - [c197]Andrei Ivanov, Nikoli Dryden, Tal Ben-Nun, Shigang Li, Torsten Hoefler:
Data Movement Is All You Need: A Case Study on Optimizing Transformers. MLSys 2021 - [c196]Marcin Copik, Alexandru Calotoiu, Tobias Grosser, Nicolas Wicki, Felix Wolf, Torsten Hoefler:
Extracting clean performance models from tainted programs. PPoPP 2021: 403-417 - [c195]Grzegorz Kwasniewski, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Timo Schneider, Maciej Besta, Torsten Hoefler:
On the parallel I/O optimality of linear algebra kernels: near-optimal LU factorization. PPoPP 2021: 463-464 - [c194]Thomas Häner, Damian S. Steiger, Torsten Hoefler, Matthias Troyer:
Distributed quantum computing with QMPI. SC 2021: 16 - [c193]Shigang Li, Torsten Hoefler:
Chimera: efficiently training large-scale neural networks with bidirectional pipelines. SC 2021: 27 - [c192]Daniele De Sensi, Salvatore Di Girolamo, Saleh Ashkboos, Shigang Li, Torsten Hoefler:
Flare: flexible in-network allreduce. SC 2021: 35 - [c191]Grzegorz Kwasniewski, Marko Kabic, Tal Ben-Nun, Alexandros Nikolaos Ziogas, Jens Eirik Saethre, André Gaillard, Timo Schneider, Maciej Besta, Anton Kozhevnikov, Joost VandeVondele, Torsten Hoefler:
On the parallel I/O optimality of linear algebra kernels: near-optimal matrix factorizations. SC 2021: 70 - [c190]Nikoli Dryden, Roman Böhringer, Tal Ben-Nun, Torsten Hoefler:
Clairvoyant prefetching for distributed machine learning I/O. SC 2021: 92 - [c189]Alexandros Nikolaos Ziogas, Timo Schneider, Tal Ben-Nun, Alexandru Calotoiu, Tiziano De Matteis, Johannes de Fine Licht, Luca Lavarini, Torsten Hoefler:
Productivity, portability, performance: data-centric Python. SC 2021: 95 - [c188]Konstantin Taranov, Salvatore Di Girolamo, Torsten Hoefler:
CoRM: Compactable Remote Memory over RDMA. SIGMOD Conference 2021: 1811-1824 - [c187]Lukas Gianinazzi, Maciej Besta, Yannick Schaffner, Torsten Hoefler:
Parallel Algorithms for Finding Large Cliques in Sparse Graphs. SPAA 2021: 243-253 - [c186]