


ACM Transactions on Architecture and Code Optimization, Volume 21
Volume 21, Number 1, March 2024
- Longfei Luo, Dingcui Yu, Yina Lv, Liang Shi: Critical Data Backup with Hybrid Flash-Based Consumer Devices. 1:1-1:23
- Peng Chen, Hui Chen, Weichen Liu, Linbo Long, Wanli Chang, Nan Guan: DAG-Order: An Order-Based Dynamic DAG Scheduling for Real-Time Networks-on-Chip. 2:1-2:24
- Zhang Jiang, Ying Chen, Xiaoli Gong, Jin Zhang, Wenwen Wang, Pen-Chung Yew: JiuJITsu: Removing Gadgets with Safe Register Allocation for JIT Code Generation. 3:1-3:26
- Hayfa Tayeb, Ludovic Paillat, Bérenger Bramas: Autovesk: Automatic Vectorized Code Generation from Unstructured Static Kernels Using Graph Transformations. 4:1-4:25
- Xueying Wang, Guangli Li, Zhen Jia, Xiaobing Feng, Yida Wang: Fast Convolution Meets Low Precision: Exploring Efficient Quantized Winograd Convolution on Modern CPUs. 5:1-5:26
- Hao Fan, Yiliang Ye, Shadi Ibrahim, Zhuo Huang, Xingru Li, Weibin Xue, Song Wu, Chen Yu, Xuanhua Shi, Hai Jin: QoS-pro: A QoS-enhanced Transaction Processing Framework for Shared SSDs. 6:1-6:25
- Yunping Zhao, Sheng Ma, Hengzhu Liu, Libo Huang, Yi Dai: SAC: An Ultra-Efficient Spin-based Architecture for Compressed DNNs. 7:1-7:26
- Tong-Yu Liu, Jianmei Guo, Bo Huang: Efficient Cross-platform Multiplexing of Hardware Performance Counters via Adaptive Grouping. 8:1-8:26
- Lei Liu, Xinglei Dou: QuCloud+: A Holistic Qubit Mapping Scheme for Single/Multi-programming on 2D/3D NISQ Quantum Computers. 9:1-9:27
- Lingxi Wu, Minxuan Zhou, Weihong Xu, Ashish Venkat, Tajana Rosing, Kevin Skadron: Abakus: Accelerating k-mer Counting with Storage Technology. 10:1-10:26
- Seokwon Kang, Jongbin Kim, Gyeongyong Lee, Jeongmyung Lee, Jiwon Seo, Hyungsoo Jung, Yong Ho Song, Yongjun Park: ISP Agent: A Generalized In-storage-processing Workload Offloading Framework by Providing Multiple Optimization Opportunities. 11:1-11:24
- Prasoon Mishra, V. Krishna Nandivada: COWS for High Performance: Cost Aware Work Stealing for Irregular Parallel Loop. 12:1-12:26
- Joongun Park, Seunghyo Kang, Sanghyeon Lee, Taehoon Kim, Jongse Park, Youngjin Kwon, Jaehyuk Huh: Hardware-hardened Sandbox Enclaves for Trusted Serverless Computing. 13:1-13:25
- Tyler N. Allen, Bennett Cooper, Rong Ge: Fine-grain Quantitative Analysis of Demand Paging in Unified Virtual Memory. 14:1-14:24
- Zhonghua Wang, Yixing Guo, Kai Lu, Jiguang Wan, Daohui Wang, Ting Yao, Huatao Wu: Rcmp: Reconstructing RDMA-Based Memory Disaggregation via CXL. 15:1-15:26
- Linbo Long, Shuiyong He, Jingcheng Shen, Renping Liu, Zhenhua Tan, Congming Gao, Duo Liu, Kan Zhong, Yi Jiang: WA-Zone: Wear-Aware Zone Management Optimization for LSM-Tree on ZNS SSDs. 16:1-16:23
- Zhihua Fan, Wenming Li, Zhen Wang, Yu Yang, Xiaochun Ye, Dongrui Fan, Ninghui Sun, Xuejun An: Improving Utilization of Dataflow Unit for Multi-Batch Processing. 17:1-17:26
- Dunbo Zhang, Qingjie Lang, Ruoxi Wang, Li Shen: Extension VM: Interleaved Data Layout in Vector Memory. 18:1-18:23
- Can Firtina, Kamlesh R. Pillai, Gurpreet S. Kalsi, Bharathwaj Suresh, Damla Senol Cali, Jeremie S. Kim, Taha Shahroodi, Meryem Banu Cavlak, Joël Lindegger, Mohammed Alser, Juan Gómez-Luna, Sreenivas Subramoney, Onur Mutlu: ApHMM: Accelerating Profile Hidden Markov Models for Fast and Energy-efficient Genome Analysis. 19:1-19:29
- Khalid Ahmad, Cris Cecka, Michael Garland, Mary W. Hall: Exploring Data Layout for Sparse Tensor Times Dense Matrix on GPUs. 20:1-20:20
Volume 21, Number 2, June 2024
- Chandra Sekhar Mummidi, Victor da Cruz Ferreira, Sudarshan Srinivasan, Sandip Kundu: Highly Efficient Self-checking Matrix Multiplication on Tiled AMX Accelerators. 21
- Zhonghua Wang, Chen Ding, Fengguang Song, Kai Lu, Jiguang Wan, Zhihu Tan, Changsheng Xie, Guokuan Li: WIPE: A Write-Optimized Learned Index for Persistent Memory. 22
- Gino A. Chacon, Charles Williams, Johann Knechtel, Ozgur Sinanoglu, Paul V. Gratz, Vassos Soteriou: Coherence Attacks and Countermeasures in Interposer-based Chiplet Systems. 23
- Yan Wei, Xingjun Zhang: A Concise Concurrent B+-Tree for Persistent Memory. 24
- Fareed Qararyah, Muhammad Waqar Azhar, Pedro Trancoso: An Efficient Hybrid Deep Learning Accelerator for Compact and Heterogeneous CNNs. 25
- Fernando Fernandes dos Santos, Luigi Carro, Flavio Vella, Paolo Rech: Assessing the Impact of Compiler Optimizations on GPUs Reliability. 26
- Valentin Isaac-Chassande, Adrian Evans, Yves Durand, Frédéric Rousseau: Dedicated Hardware Accelerators for Processing of Sparse Matrices and Vectors: A Survey. 27
- Benyi Xie, Yue Yan, Chenghao Yan, Sicheng Tao, Zhuangzhuang Zhang, Xinyu Li, Yanzhi Lan, Xiang Wu, Tianyi Liu, Tingting Zhang, Fuxin Zhang: An Instruction Inflation Analyzing Framework for Dynamic Binary Translators. 28
- Samuel Rac, Mats Brorsson: Cost-aware Service Placement and Scheduling in the Edge-Cloud Continuum. 29
- Feng Xue, Chenji Han, Xinyu Li, Junliang Wu, Tingting Zhang, Tianyi Liu, Yifan Hao, Zidong Du, Qi Guo, Fuxin Zhang: Tyche: An Efficient and General Prefetcher for Indirect Memory Accesses. 30
- Kunpeng Xie, Ye Lu, Xinyu He, Dezhi Yi, Huijuan Dong, Yao Chen: Winols: A Large-Tiling Sparse Winograd CNN Accelerator on FPGAs. 31
- Ke Liu, Kan Wu, Hua Wang, Ke Zhou, Peng Wang, Ji Zhang, Cong Li: SLAP: Segmented Reuse-Time-Label Based Admission Policy for Content Delivery Network Caching. 32
- Panagiotis Miliadis, Dimitris Theodoropoulos, Dionisios N. Pnevmatikatos, Nectarios Koziris: Architectural Support for Sharing, Isolating and Virtualizing FPGA Resources. 33
- Haitao Du, Yuhan Qin, Song Chen, Yi Kang: FASA-DRAM: Reducing DRAM Latency with Destructive Activation and Delayed Restoration. 34
- Michael Canesche, Vanderson Martins do Rosário, Edson Borin, Fernando Magno Quintão Pereira: The Droplet Search Algorithm for Kernel Scheduling. 35
- Asmita Pal, Keerthana Desai, Rahul Chatterjee, Joshua San Miguel: Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces. 36
- Chengying Huan, Yongchao Liu, Heng Zhang, Shuaiwen Song, Santosh Pandey, Shiyang Chen, Xiangfei Fang, Yue Jin, Baptiste Lepers, Yanjun Wu, Hang Liu: TEA+: A Novel Temporal Graph Random Walk Engine with Hybrid Storage Architecture. 37
- Soojin Hwang, Daehyeon Baek, Jongse Park, Jaehyuk Huh: Cerberus: Triple Mode Acceleration of Sparse Matrix and Vector Multiplication. 38
- Siddhartha Raman Sundara Raman, Lizy Kurian John, Jaydeep P. Kulkarni: NEM-GNN: DAC/ADC-less, Scalable, Reconfigurable, Graph and Sparsity-Aware Near-Memory Accelerator for Graph Neural Networks. 39
- Yan Chen, Qiwen Ke, Huiba Li, Yongwei Wu, Yiming Zhang: xMeta: SSD-HDD-hybrid Optimization for Metadata Maintenance of Cloud-scale Object Storage. 40
- Vidush Singhal, Laith Sakka, Kirshanthan Sundararajah, Ryan Newton, Milind Kulkarni: Orchard: Heterogeneous Parallelism and Fine-grained Fusion for Complex Tree Traversals. 41
Volume 21, Number 3, September 2024
- Hajar Falahati, Mohammad Sadrosadati, Qiumin Xu, Juan Gómez-Luna, Banafsheh Saber Latibari, Hyeran Jeon, Shaahin Hessabi, Hamid Sarbazi-Azad, Onur Mutlu, Murali Annavaram, Massoud Pedram: Cross-core Data Sharing for Energy-efficient GPUs. 42:1-42:32
- Ching-Jui Lee, Tsung Tai Yeh: ReSA: Reconfigurable Systolic Array for Multiple Tiny DNN Tensors. 43:1-43:24
- Ziheng Wang, Xiaoshe Dong, Yan Kang, Heng Chen, Qiang Wang: An Example of Parallel Merkle Tree Traversal: Post-Quantum Leighton-Micali Signature on the GPU. 44:1-44:25
- Jiang Wu, Zhuo Zhang, Deheng Yang, Jianjun Xu, Jiayu He, Xiaoguang Mao: Knowledge-Augmented Mutation-Based Bug Localization for Hardware Design Code. 45:1-45:26
- Chen Ding, Jian Zhou, Kai Lu, Sicen Li, Yiqin Xiong, Jiguang Wan, Ling Zhan: D2Comp: Efficient Offload of LSM-tree Compaction with Data Processing Units on Disaggregated Storage. 46:1-46:22
- Zhuohao Wang, Lei Liu, Limin Xiao: iSwap: A New Memory Page Swap Mechanism for Reducing Ineffective I/O Operations in Cloud Environments. 47:1-47:24
- Junkaixuan Li, Yi Kang: GraphSER: Distance-Aware Stream-Based Edge Repartition for Many-Core Systems. 48:1-48:25
- Ke Wu, Dezun Dong, Weixia Xu: COER: A Network Interface Offloading Architecture for RDMA and Congestion Control Protocol Codesign. 49:1-49:26
- Qunyou Liu, Darong Huang, Luis Costero, Marina Zapater, David Atienza: Intermediate Address Space: virtual memory optimization of heterogeneous architectures for cache-resident workloads. 50:1-50:23
- Dongmoon Min, Ilkwon Byun, Gyu-hyeon Lee, Jangwoo Kim: CoolDC: A Cost-Effective Immersion-Cooled Datacenter with Workload-Aware Temperature Scaling. 51:1-51:27
- Hai Zhou, Dan Feng: Stripe-schedule Aware Repair in Erasure-coded Clusters with Heterogeneous Star Networks. 52:1-52:24
- Bobin Deng, Bhargava Nadendla, Kun Suo, Chloe Yixin Xie, Dan Chia-Tien Lo: Fixed-point Encoding and Architecture Exploration for Residue Number Systems. 53:1-53:27
- Yizhuo Wang, Fangli Chang, Bingxin Wei, Jianhua Gao, Weixing Ji: Optimization of Sparse Matrix Computation for Algebraic Multigrid on GPUs. 54:1-54:27
- Luming Wang, Xu Zhang, Songyue Wang, Zhuolun Jiang, Tianyue Lu, Mingyu Chen, Siwei Luo, Keji Huang: Asynchronous Memory Access Unit: Exploiting Massive Parallelism for Far Memory Access. 55:1-55:28
- Yunping Zhao, Sheng Ma, Hengzhu Liu, Dongsheng Li: SAL: Optimizing the Dataflow of Spin-based Architectures for Lightweight Neural Networks. 56:1-56:27
- Kai Lu, Siqi Zhao, Haikang Shan, Qiang Wei, Guokuan Li, Jiguang Wan, Ting Yao, Huatao Wu, Daohui Wang: Scythe: A Low-latency RDMA-enabled Distributed Transaction System for Disaggregated Memory. 57:1-57:26
- Wangqi Peng, Yusen Li, Xiaoguang Liu, Gang Wang: Lavender: An Efficient Resource Partitioning Framework for Large-Scale Job Colocation. 58:1-58:23
- Feng Zhang, Fulin Nan, Binbin Xu, Zhirong Shen, Jiebin Zhai, Dmitrii I. Kaplun, Jiwu Shu: Achieving Tunable Erasure Coding with Cluster-Aware Redundancy Transitioning. 59:1-59:24
- Ataberk Olgun, F. Nisa Bostanci, Geraldo Francisco de Oliveira Junior, Yahya Can Tugrul, Rahul Bera, Abdullah Giray Yaglikçi, Hasan Hassan, Oguz Ergin, Onur Mutlu: Sectored DRAM: A Practical Energy-Efficient and High-Performance Fine-Grained DRAM Architecture. 60:1-60:29
- Xiaohui Wei, Chenyang Wang, Hengshan Yue, Jingweijia Tan, Zeyu Guan, Nan Jiang, Xinyang Zheng, Jianpeng Zhao, Meikang Qiu: ReIPE: Recycling Idle PEs in CNN Accelerator for Vulnerable Filters Soft-Error Detection. 61:1-61:26
- Qiao Li, Yu Chen, Guanyu Wu, Yajuan Du, Min Ye, Xinbiao Gan, Jie Zhang, Zhirong Shen, Jiwu Shu, Chun Xue: Characterizing and Optimizing LDPC Performance on 3D NAND Flash Memories. 62:1-62:26
- Jiahong Xu, Haikun Liu, Zhuohui Duan, Xiaofei Liao, Hai Jin, Xiaokang Yang, Huize Li, Cong Liu, Fubing Mao, Yu Zhang: ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators. 63:1-63:26
- Jiang Wu, Zhuo Zhang, Deheng Yang, Jianjun Xu, Jiayu He, Xiaoguang Mao: Time-Aware Spectrum-Based Bug Localization for Hardware Design Code with Data Purification. 64:1-64:25
Volume 21, Number 4, December 2024
- Zhuoran Song, Zhongkai Yu, Xinkai Song, Yifan Hao, Li Jiang, Naifeng Jing, Xiaoyao Liang: Environmental Condition Aware Super-Resolution Acceleration Framework in Server-Client Hierarchies. 65:1-65:26
- Georgia Antoniou, Davide B. Bartolini, Haris Volos, Marios Kleanthous, Zhe Wang, Kleovoulos Kalaitzidis, Tom Rollet, Ziwei Li, Onur Mutlu, Yiannakis Sazeides, Jawad Haj-Yahya: Agile C-states: A Core C-state Architecture for Latency Critical Applications Optimizing both Transition and Cold-Start Latency. 66:1-66:26
- Xinbiao Gan, Tiejun Li, Feng Xiong, Bo Yang, Xinhai Chen, Chunye Gong, Shijie Li, Kai Lu, Qiao Li, Yiming Zhang: MST: Topology-Aware Message Aggregation for Exascale Graph Processing of Traversal-Centric Algorithms. 67:1-67:22
- Yujie Cui, Wei Chen, Xu Cheng, Jiangfang Yi: Hyperion: A Highly Effective Page and PC Based Delta Prefetcher. 68:1-68:27
- Jianhua Gao, Weixing Ji, Yizhuo Wang: Optimization of Large-Scale Sparse Matrix-Vector Multiplication on Multi-GPU Systems. 69:1-69:24
- Zhengding Hu, Jingwei Sun, Zhongyang Li, Guangzhong Sun: AG-SpTRSV: An Automatic Framework to Optimize Sparse Triangular Solve on GPUs. 70:1-70:25
- Wenbo Zhang, Yiqi Liu, Tianhao Zang, Zhenshan Bao: EA4RCA: Efficient AIE accelerator design framework for regular Communication-Avoiding Algorithm. 71:1-71:24
- Arun Thangamani, Vincent Loechner, Stéphane Genaud: A Survey of General-purpose Polyhedral Compilers. 72:1-72:26
- Junqing Lin, Jingwei Sun, Xiaolong Shi, Honghe Zhang, Xianzhi Yu, Xinzhi Wang, Jun Yao, Guangzhong Sun: LO-SpMM: Low-cost Search for High-performance SpMM Kernels on GPUs. 73:1-73:25
- Chenglong Yi, Jintong Liu, Shenggang Wan, Juntao Fang, Bin Sun, Liqiang Zhang: Data Deduplication Based on Content Locality of Transactions to Enhance Blockchain Scalability. 74:1-74:24
- Joshua Dennis Booth, Phillip Allen Lane: A NUMA-Aware Version of an Adaptive Self-Scheduling Loop Scheduler. 75:1-75:22
- Yu Tang, Qiao Li, Lujia Yin, Dongsheng Li, Yiming Zhang, Chenyu Wang, Xingcheng Zhang, Linbo Qiao, Zhaoning Zhang, Kai Lu: DELTA: Memory-Efficient Training via Dynamic Fine-Grained Recomputation and Swapping. 76:1-76:25
- Zhenhua Tan, Linbo Long, Jingcheng Shen, Renping Liu, Congming Gao, Kan Zhong, Yi Jiang: Optimizing Garbage Collection for ZNS SSDs via In-storage Data Migration and Address Remapping. 77:1-77:25
- Xiang Li, Qiong Chang, Aolong Zha, Shijie Chang, Yun Li, Jun Miyazaki: An Optimized GPU Implementation for GIST Descriptor. 78:1-78:24
- Xiaobo Lu, Jianbin Fang, Lin Peng, Chun Huang, Zidong Du, Yongwei Zhao, Zheng Wang: Mentor: A Memory-Efficient Sparse-dense Matrix Multiplication Accelerator Based on Column-Wise Product. 79:1-79:25
- Yu Feng, Weikai Lin, Zihan Liu, Jingwen Leng, Minyi Guo, Han Zhao, Xiaofeng Hou, Jieru Zhao, Yuhao Zhu: Potamoi: Accelerating Neural Rendering via a Unified Streaming Architecture. 80:1-80:25
- Changxi Liu, Alen Sabu, Akanksha Chaudhari, Qingxuan Kang, Trevor E. Carlson: Pac-Sim: Simulation of Multi-threaded Workloads using Intelligent, Live Sampling. 81:1-81:26
- Saurabh Raje, Yufan Xu, Atanas Rountev, Edward F. Valeev, P. Sadayappan: CoNST: Code Generator for Sparse Tensor Networks. 82:1-82:24
- Danlin Jia, Geng Yuan, Yiming Xie, Xue Lin, Ningfang Mi: A Data-Loader Tunable Knob to Shorten GPU Idleness for Distributed Deep Learning. 83:1-83:25
- Shaobu Wang, Guangyan Zhang, Junyu Wei, Yang Wang, Jiesheng Wu, Qingchao Luo: Understanding Silent Data Corruption in Processors for Mitigating its Effects. 84:1-84:27
- Yen-Yu Lu, Chin-Hsien Wu, Shih-Jen Li, Cheng-Tze Lee, Cheng-Yen Wu: A Stable Idle Time Detection Platform for Real I/O Workloads. 85:1-85:23
- Lingyu Sun, Xiaofeng Hou, Chao Li, Jiacheng Liu, Xinkai Wang, Quan Chen, Minyi Guo: A2: Towards Accelerator Level Parallelism for Autonomous Micromobility Systems. 86:1-86:20
- Manojna Sistla, Yiding Liu, Xin Fu: Towards High Performance QNNs via Distribution-Based CNOT Gate Reduction. 87:1-87:22
- Fubing Mao, Xu Liu, Yu Zhang, Haikun Liu, Xiaofei Liao, Hai Jin, Wei Zhang, Jian Zhou, Yufei Wu, Longyu Nie, Yapu Guo, Zihan Jiang, Jingkang Liu: PMGraph: Accelerating Concurrent Graph Queries over Streaming Graphs. 88:1-88:25
- Wentong Li, Yina Lv, Longfei Luo, Yunpeng Song, Liang Shi: Access Characteristic-Guided Remote Swapping Across Mobile Devices. 89:1-89:25
- Yinan Zhang, Shun Yang, Huiqi Hu, Chengcheng Yang, Peng Cai, Xuan Zhou: SuccinctKV: a CPU-efficient LSM-tree Based KV Store with Scan-based Compaction. 90:1-90:26
- Siyuan Ma, Kaustubh Manohar Mhatre, Jian Weng, Bagus Hanindhito, Zhengrong Wang, Tony Nowatzki, Lizy K. John, Aman Arora: PIMSAB: A Processing-In-Memory System with Spatially-Aware Communication and Bit-Serial-Aware Computation. 91:1-91:27















