


Haonan Zhang 0003
Person information
- affiliation: University of Electronic Science and Technology of China (UESTC), Future Media Center, School of Computer Science and Engineering, Chengdu, China
- affiliation: Sichuan Artificial Intelligence Research Institute, Yibin, China
Other persons with the same name
- Haonan Zhang (disambiguation page)
- Haonan Zhang 0001: Fudan University, Department of Computer Science and Technology, School of Computer Science, Shanghai, China
- Haonan Zhang 0002: Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, Shaanxi, China
- Haonan Zhang 0004: Shanghai Jiao Tong University, School of Electronic Information and Electrical Engineering, China
- Haonan Zhang 0005: Southwest JiaoTong University, School of Information Science and Technology, Chengdu, China
- Haonan Zhang 0006: University of Waterloo, ON, Canada (and 1 more)
- Haonan Zhang 0007: Zhejiang University, College of Control Science and Engineering, Hangzhou, China
Journal Articles
- 2025
[j8]Yixin Qin, Lei Zhao, Lianli Gao, Haonan Zhang, Pengpeng Zeng, Heng Tao Shen:
Temporal-Guided Mixture-of-Experts for Zero-Shot Video Question Answering. IEEE Trans. Circuits Syst. Video Technol. 35(9): 9003-9016 (2025)
[j7]Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Yihang Duan, Xinyu Lyu, Heng Tao Shen:
Text-Video Retrieval With Global-Local Semantic Consistent Learning. IEEE Trans. Image Process. 34: 3463-3474 (2025)
[j6]Pengpeng Zeng, Haonan Zhang, Lianli Gao, Xiangpeng Li, Jin Qian, Heng Tao Shen:
Visual Commonsense-Aware Representation Network for Video Captioning. IEEE Trans. Neural Networks Learn. Syst. 36(1): 1092-1103 (2025)
- 2024
[j5]Haonan Zhang, Pengpeng Zeng, Lianli Gao, Xinyu Lyu, Jingkuan Song, Heng Tao Shen:
SPT: Spatial Pyramid Transformer for Image Captioning. IEEE Trans. Circuits Syst. Video Technol. 34(6): 4829-4842 (2024)
[j4]Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen:
Ump: Unified Modality-Aware Prompt Tuning for Text-Video Retrieval. IEEE Trans. Circuits Syst. Video Technol. 34(11): 11954-11964 (2024)
[j3]Shuaiqi Jing, Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen:
Memory-Based Augmentation Network for Video Captioning. IEEE Trans. Multim. 26: 2367-2379 (2024)
- 2023
[j2]Haonan Zhang, Pengpeng Zeng, Yuxuan Hu, Jin Qian, Jingkuan Song, Lianli Gao:
Learning visual question answering on controlled semantic noisy labels. Pattern Recognit. 138: 109339 (2023)
- 2022
[j1]Pengpeng Zeng, Haonan Zhang, Lianli Gao, Jingkuan Song, Heng Tao Shen:
Video Question Answering With Prior Knowledge and Object-Sensitive Learning. IEEE Trans. Image Process. 31: 5936-5948 (2022)
Conference and Workshop Papers
- 2025
[c7]Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Yongbin Li, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Hamid Alinejad-Rokny, Xiaobo Xia, Jingkuan Song, Fei Huang:
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct. ACL (Findings) 2025: 19655-19682
[c6]Haonan Zhang, Run Luo, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li:
OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. ACL (1) 2025: 26318-26331
- 2024
[c5]Hao Ni, Ping Lai, Yuke Li, Pengpeng Zeng, Haonan Zhang, Jingkuan Song:
Pedestrian Attributes Recognition for UAV-Human. ICME Workshops 2024: 1-5
[c4]Haonan Zhang, Pengpeng Zeng, Lianli Gao, Jingkuan Song, Heng Tao Shen:
MPT: Multi-grained Prompt Tuning for Text-Video Retrieval. ACM Multimedia 2024: 1206-1214
- 2023
[c3]Haonan Zhang, Lianli Gao, Pengpeng Zeng, Alan Hanjalic, Heng Tao Shen:
Depth-Aware Sparse Transformer for Video-Language Learning. ACM Multimedia 2023: 4778-4787
- 2022
[c2]Pengpeng Zeng, Haonan Zhang, Jingkuan Song, Lianli Gao:
S2 Transformer for Image Captioning. IJCAI 2022: 1608-1614
[c1]Hao Li, Jingkuan Song, Lianli Gao, Pengpeng Zeng, Haonan Zhang, Gongfu Li:
A Differentiable Semantic Metric Approximation in Probabilistic Embedding for Cross-Modal Retrieval. NeurIPS 2022
Informal and Other Publications
- 2025
[i5]Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Yangyi Chen, Hamid Alinejad-Rokny, Fei Huang:
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis. CoRR abs/2501.04561 (2025)
[i4]Haonan Zhang, Run Luo, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li:
OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. CoRR abs/2505.20277 (2025)
[i3]Feiteng Fang, Ting-En Lin, Yuchuan Wu, Xiong Liu, Xiang Huang, Dingwei Chen, Jing Ye, Haonan Zhang, Liang Zhu, Hamid Alinejad-Rokny, Min Yang, Fei Huang, Yongbin Li:
ChARM: Character-based Act-adaptive Reward Modeling for Advanced Role-Playing Language Agents. CoRR abs/2505.23923 (2025)
- 2024
[i2]Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li:
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct. CoRR abs/2409.05840 (2024)
- 2022
[i1]Pengpeng Zeng, Haonan Zhang, Lianli Gao, Xiangpeng Li, Jin Qian, Heng Tao Shen:
Visual Commonsense-aware Representation Network for Video Captioning. CoRR abs/2211.09469 (2022)
last updated on 2025-11-04 00:37 CET by the dblp team
all metadata released as open data under CC0 1.0 license







