


Zhihao Zhang 0001
Person information
- affiliation: Carnegie Mellon University, Pittsburgh, PA, USA
Other persons with the same name
- Zhihao Zhang (aka: Zhi-hao Zhang, Zhi-Hao Zhang, Zhi Hao Zhang): disambiguation page
- Zhihao Zhang 0002: Fudan University, School of Computer Science, Shanghai, China
- Zhihao Zhang 0003: Nanjing Tech University, College of Electrical Engineering and Control Science, Nanjing, China (and 2 more)
- Zhihao Zhang 0004: Beijing University of Technology, School of Economics and Management, Beijing, China
- Zhihao Zhang 0005: Subtle Medical Inc., Shanghai, China (and 1 more)
2020 – today
2025
- [c6] Lijie Yang, Zhihao Zhang, Zhuofu Chen, Zikun Li, Zhihao Jia:
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention. ICLR 2025
- [i10] Zhihao Jia, Qi Pang, Trung Tran, David P. Woodruff, Zhihao Zhang, Wenting Zheng:
Communication Bounds for the Distributed Experts Problem. CoRR abs/2501.03132 (2025)
- [i9] Zikun Li, Zhuofu Chen, Remi Delacourt, Gabriele Oliaro, Zeyu Wang, Qinghan Chen, Shuhuai Lin, April Yang, Zhihao Zhang, Zhuoming Chen, Sean Lai, Xupeng Miao, Zhihao Jia:
AdaServe: SLO-Customized LLM Serving with Fine-Grained Speculative Decoding. CoRR abs/2501.12162 (2025)
- [i8] Rui Pan, Yinwei Dai, Zhihao Zhang, Gabriele Oliaro, Zhihao Jia, Ravi Netravali:
SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning. CoRR abs/2504.07891 (2025)
2024
- [c5] Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Qing Li, Yong Jiang, Zhihao Jia:
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models. ACL (1) 2024: 1-17
- [c4] Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Zhengxin Zhang, Rae Ying Yee Wong, Alan Zhu, Lijie Yang, Xiaoxiang Shi, Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia:
SpecInfer: Accelerating Large Language Model Serving with Tree-based Speculative Inference and Verification. ASPLOS (3) 2024: 932-949
- [c3] Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, Zhihao Jia:
Accelerating Iterative Retrieval-augmented Language Model Serving with Speculation. ICML 2024
- [c2] Zhihao Jia, Qi Pang, Trung Tran, David P. Woodruff, Zhihao Zhang, Wenting Zheng:
Communication Bounds for the Distributed Experts Problem. NeurIPS 2024
- [i7] Zhihao Zhang, Alan Zhu, Lijie Yang, Yihua Xu, Lanting Li, Phitchaya Mangpo Phothilimthana, Zhihao Jia:
Accelerating Retrieval-Augmented Language Model Serving with Speculation. CoRR abs/2401.14021 (2024)
- [i6] Lijie Yang, Zhihao Zhang, Zhuofu Chen, Zikun Li, Zhihao Jia:
TidalDecode: Fast and Accurate LLM Decoding with Position Persistent Sparse Attention. CoRR abs/2410.05076 (2024)
2023
- [i5] Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, Rae Ying Yee Wong, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia:
SpecInfer: Accelerating Generative LLM Serving with Speculative Inference and Token Tree Verification. CoRR abs/2305.09781 (2023)
- [i4] Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Hongyi Jin, Tianqi Chen, Zhihao Jia:
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems. CoRR abs/2312.15234 (2023)
2022
- [j1] Jiachen Li, Hengbo Ma, Zhihao Zhang, Jinning Li, Masayoshi Tomizuka:
Spatio-Temporal Graph Dual-Attention Network for Multi-Agent Prediction and Tracking. IEEE Trans. Intell. Transp. Syst. 23(8): 10556-10569 (2022)
- [c1] Zhihao Zhang, Zhihao Jia:
GradSign: Model Performance Inference with Theoretical Insights. ICLR 2022
2021
- [i3] Jiachen Li, Hengbo Ma, Zhihao Zhang, Jinning Li, Masayoshi Tomizuka:
Spatio-Temporal Graph Dual-Attention Network for Multi-Agent Prediction and Tracking. CoRR abs/2102.09117 (2021)
- [i2] Zhihao Zhang, Zhihao Jia:
GradSign: Model Performance Inference with Theoretical Insights. CoRR abs/2110.08616 (2021)
2020
- [i1] Jiachen Li, Hengbo Ma, Zhihao Zhang, Masayoshi Tomizuka:
Social-WaGDAT: Interaction-aware Trajectory Prediction via Wasserstein Graph Double-Attention Network. CoRR abs/2002.06241 (2020)
last updated on 2025-09-03 01:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license