


63rd ACL 2025: Vienna, Austria - Long Papers
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar:

Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2025, Vienna, Austria, July 27 - August 1, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-251-0 - Frontmatter.

- Weiqi Wang, Limeng Cui, Xin Liu, Sreyashi Nag, Wenju Xu, Chen Luo, Sheikh Muhammad Sarwar, Yang Li, Hansu Gu, Hui Liu, Changlong Yu, Jiaxin Bai, Yifan Gao, Haiyang Zhang, Qi He, Shuiwang Ji, Yangqiu Song:

EcomScriptBench: A Multi-task Benchmark for E-commerce Script Planning via Step-wise Intention-Driven Product Association. 1-22 - Bo Pan, Zhen Xiong, Guanchen Wu, Zheng Zhang, Yifei Zhang, Yuntong Hu, Liang Zhao:

GraphNarrator: Generating Textual Explanations for Graph Neural Networks. 23-42 - Srishti Gureja, Lester James Validad Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Triandi Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, Marzieh Fadaee:

M-RewardBench: Evaluating Reward Models in Multilingual Settings. 43-58 - Xinwei Yang, Zhaofeng Liu, Chen Huang, Jiashuai Zhang, Tong Zhang, Yifan Zhang, Wenqiang Lei:

ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming. 59-104 - Jacy Reese Anthis, Kristian Lum, Michael D. Ekstrand, Avi Feller, Chenhao Tan:

The Impossibility of Fair LLMs. 105-120 - Ermo Hua, Biqing Qi, Kaiyan Zhang, Kai Tian, Xingtai Lv, Ning Ding, Bowen Zhou:

Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process. 121-136 - Kristian Lum, Jacy Reese Anthis, Kevin Robinson, Chirag Nagpal, Alexander Nicholas D'Amour:

Bias in Language Models: Beyond Trick Tests and Towards RUTEd Evaluation. 137-161 - Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou:

Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models. 162-176 - Aaron Nicolson, Shengyao Zhuang, Jason Dowling, Bevan Koopman:

The Impact of Auxiliary Patient Data on Automated Chest X-Ray Report Generation and How to Incorporate It. 177-203 - Jingheng Ye, Zishan Xu, Yinghui Li, Linlin Song, Qingyu Zhou, Hai-Tao Zheng, Ying Shen, Wenhao Jiang, Hong-Gee Kim, Ruitong Liu, Xin Su, Zifei Shan:

CLEME2.0: Towards Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction. 204-222 - Zhouhong Gu, Haoning Ye, Xingzhou Chen, Zeyang Zhou, Hongwei Feng, Yanghua Xiao:

StrucText-Eval: Evaluating Large Language Model's Reasoning Ability in Structure-Rich Text. 223-244 - Haokun Liu, Yangqiaoyu Zhou, Mingxuan Li, Chenfei Yuan, Chenhao Tan:

Literature Meets Data: A Synergistic Approach to Hypothesis Generation. 245-281 - Zhouhong Gu, Xingzhou Chen, Xiaoran Shi, Tao Wang, Suhang Zheng, Tianyu Li, Hongwei Feng, Yanghua Xiao:

GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization. 282-296 - Ziyang Luo, Kaixin Li, Hongzhan Lin, Yuchen Tian, Mohan S. Kankanhalli, Jing Ma:

Tree-of-Evolution: Tree-Structured Instruction Evolution for Code Generation in Large Language Models. 297-316 - Seunguk Yu, Juhwan Choi, YoungBin Kim:

Delving into Multilingual Ethical Bias: The MSQAD with Statistical Hypothesis Tests for Large Language Models. 317-340 - Dosung Lee, Wonjun Oh, Boyoung Kim, Minyoung Kim, Joonsuk Park, Paul Hongsuck Seo:

ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision. 341-359 - Hongzhan Lin, Yang Deng, Yuxuan Gu, Wenxuan Zhang, Jing Ma, See-Kiong Ng, Tat-Seng Chua:

FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models. 360-381 - Loïc Fosse, Frédéric Béchet, Benoît Favre, Géraldine Damnati, Gwénolé Lecorvé, Maxime Darrin, Philippe Formont, Pablo Piantanida:

Statistical Deficiency for Task Inclusion Estimation. 382-415 - Jabin Koo, Minwoo Jang, Jungseul Ok:

Towards Robust and Efficient Federated Low-Rank Adaptation with Heterogeneous Clients. 416-429 - Kaibo Liu, Zhenpeng Chen, Yiyang Liu, Jie M. Zhang, Mark Harman, Yudong Han, Yun Ma, Yihong Dong, Ge Li, Gang Huang:

LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs. 430-440 - Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu:

Capture the Key in Reasoning to Enhance CoT Distillation Generalization. 441-465 - Chen Huang, Yang Deng, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua, Jimmy Huang:

How to Enable Effective Cooperation Between Humans and NLP Models: A Survey of Principles, Formalizations, and Beyond. 466-488 - Li Zheng, Sihang Wang, Hao Fei, Zuquan Peng, Fei Li, Jianming Fu, Chong Teng, Donghong Ji:

Enhancing Hyperbole and Metaphor Detection with Their Bidirectional Dynamic Interaction and Emotion Knowledge. 489-499 - Jun Gao, Qi Lv, Zili Wang, Tianxiang Wu, Ziqiang Cao, Wenjie Li:

UniICL: An Efficient ICL Framework Unifying Compression, Selection, and Generation. 500-510 - Maksim Aparovich, Volha Harytskaya, Vladislav Poritski, Oksana Volchek, Pavel Smrz:

BelarusianGLUE: Towards a Natural Language Understanding Benchmark for Belarusian. 511-527 - Fan Zhang, Hao Chen, Zhihong Zhu, Ziheng Zhang, Zhenxi Lin, Ziyue Qiao, Yefeng Zheng, Xian Wu:

A Survey on Foundation Language Models for Single-cell Biology. 528-549 - Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang:

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios. 550-572 - Xinhao Xu, Jiaxin Li, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding:

Extending LLM Context Window with Adaptive Grouped Positional Encoding: A Training-Free Method. 573-587 - Sungjae Lee, Hyejin Park, Jaechang Kim, Jungseul Ok:

Semantic Exploration with Adaptive Gating for Efficient Problem Solving with Language Models. 588-606 - Arian Askari, Emmanouil Stergiadis, Ilya Gusev, Moran Beladev:

HotelMatch-LLM: Joint Multi-Task Training of Small and Large Language Models for Efficient Multimodal Hotel Retrieval. 607-619 - Jingping Liu, Ziyan Liu, Zhedong Cen, Yan Zhou, Yinan Zou, Weiyan Zhang, Haiyun Jiang, Tong Ruan:

Can Multimodal Large Language Models Understand Spatial Relations? 620-632 - Márton Kardos, Jan Kostkan, Kenneth C. Enevoldsen, Arnault-Quentin Vermillet, Kristoffer L. Nielbo, Roberta Rocca:

S³ - Semantic Signal Separation. 633-666 - Lanxiang Hu, Tajana Rosing, Hao Zhang:

TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs. 667-681 - Ariel Gera, Odellia Boni, Yotam Perlitz, Roy Bar-Haim, Lilach Eden, Asaf Yehudai:

JuStRank: Benchmarking LLM Judges for System Ranking. 682-712 - Zexuan Li, Hongliang Dai, Piji Li:

Generating Diverse Training Samples for Relation Extraction with Large Language Models. 713-726 - Dominik Macko, Jakub Kopal, Róbert Móro, Ivan Srba:

MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts. 727-752 - Cilin Yan, Jingyun Wang, Lin Zhang, Ruihui Zhao, Xiaopu Wu, Kai Xiong, Qingsong Liu, Guoliang Kang, Yangyang Kang:

Efficient and Accurate Prompt Optimization: the Benefit of Memory in Exemplar-Guided Reflection. 753-779 - Aneta Zugecova, Dominik Macko, Ivan Srba, Róbert Móro, Jakub Kopál, Katarina Marcincinova, Matús Mesarcík:

Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation. 780-797 - Cheng Qian, Peixuan Han, Qinyu Luo, Bingxiang He, Xiusi Chen, Yuji Zhang, Hongyi Du, Jiarui Yao, Xiaocheng Yang, Denghui Zhang, Yunzhu Li, Heng Ji:

EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents. 798-820 - Teng Wang, Wing Yin Yu, Zhenqi He, Zehua Liu, Hailei Gong, Han Wu, Xiongwei Han, Wei Shi, Ruifeng She, Fangzhou Zhu, Tao Zhong:

BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving. 821-838 - Jakub Smíd, Pavel Pribán, Pavel Král:

LACA: Improving Cross-lingual Aspect-Based Sentiment Analysis with LLM Data Augmentation. 839-853 - Ning Ding, Yulin Chen, Ganqu Cui, Xingtai Lv, Weilin Zhao, Kaiyan Zhang, Ruobing Xie, Bowen Zhou, Zhiyuan Liu, Maosong Sun:

Fusing Highly Specialized Language Models for Comprehensive Expertise. 854-878 - Meng-Chieh Lee, Qi Zhu, Costas Mavromatis, Zhen Han, Soji Adeshina, Vassilis N. Ioannidis, Huzefa Rangwala, Christos Faloutsos:

HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases. 879-893 - Rajvardhan Oak, Muhammad Haroon, Claire Wonjeong Jo, Magdalena Wojcieszak, Anshuman Chhabra:

Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms. 894-908 - Yidong Gan, Maciej Rybinski, Ben Hachey, Jonathan K. Kummerfeld:

Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review. 909-922 - Ziyan Liu, Chunxiao Fan, Haoran Lou, Yuexin Wu, Kaiwei Deng:

MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection. 923-947 - Wei Tang, Yixin Cao, Yang Deng, Jiahao Ying, Bo Wang, Yizhe Yang, Yuyue Zhao, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Yong Liao:

EvoWiki: Evaluating LLMs on Evolving Knowledge. 948-964 - Yihong Dong, Yuchen Liu, Xue Jiang, Bin Gu, Zhi Jin, Ge Li:

Rethinking Repetition Problems of LLMs in Code Generation. 965-985 - Kun Ouyang, Yuanxin Liu, Shicheng Li, Yi Liu, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun:

PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension. 986-1008 - Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin:

ProcessBench: Identifying Process Errors in Mathematical Reasoning. 1009-1024 - Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng:

Model Extrapolation Expedites Alignment. 1025-1041 - Yi Liu, Guoyin Wang, Shicheng Li, Feifan Song, Xu Sun:

ATLANTIS: Weak-to-Strong Learning via Importance Sampling. 1042-1052 - Zhaodan Zhang, Zhao Zhang, Jin Zhang, Hui Xu, Xueqi Cheng:

MPVStance: Mitigating Hallucinations in Stance Detection with Multi-Perspective Verification. 1053-1067 - Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma:

Personality-Guided Code Generation Using Large Language Models. 1068-1080 - Haojie Xie, Yirong Chen, Xiaofen Xing, Jingkai Lin, Xiangmin Xu:

PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling. 1081-1115 - Xu Zou:

BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework. 1116-1134 - Chao Deng, Jiale Yuan, Pi Bu, Peijie Wang, Zhong-Zhi Li, Jian Xu, Xiao-Hui Li, Yuan Gao, Jun Song, Bo Zheng, Cheng-Lin Liu:

LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating. 1135-1159 - Yu Lin, Ruining Yang, Yunlong Mao, Qizhi Zhang, Jue Hong, Quanwei Cai, Ye Wu, Huiqi Liu, Zhiyu Chen, Bing Duan, Sheng Zhong:

ObfusLM: Privacy-preserving Language Model Service against Embedding Inversion Attacks. 1160-1174 - Federico Ruggeri, Gaetano Signorelli:

Interlocking-free Selective Rationalization Through Genetic-based Learning. 1175-1191 - Lucas Georges Gabriel Charpentier, Pierre Lison:

Re-identification of De-identified Documents with Autoregressive Infilling. 1192-1209 - Haomiao Tang, Jinpeng Wang, Yuang Peng, Guanghao Meng, Ruisheng Luo, Bin Chen, Long Chen, Yaowei Wang, Shutao Xia:

Modeling Uncertainty in Composed Image Retrieval via Probabilistic Embeddings. 1210-1222 - Junfeng Tian, Da Zheng, Yang Chen, Rui Wang, Colin Zhang, Debing Zhang:

Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models. 1223-1242 - Honghua Dong, Qidong Su, Yubo Gao, Zhaoyu Li, Yangjun Ruan, Gennady Pekhimenko, Chris J. Maddison, Xujie Si:

APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts. 1243-1266 - Cristiano Ciaccio, Alessio Miaschi, Felice Dell'Orletta:

Evaluating Lexical Proficiency in Neural Language Models. 1267-1286 - Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen M. Meng, Furu Wei:

Autoregressive Speech Synthesis without Vector Quantization. 1287-1300 - Letian Peng, Zilong Wang, Feng Yao, Jingbo Shang:

Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest. 1301-1315 - Raghav Singhal, Kaustubh Ponkshe, Praneeth Vepakomma:

FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Large Language Models. 1316-1336 - Rahul Zalkikar, Kanchan Chandra:

Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality. 1337-1361 - Siddharth Mangalik, Adithya V. Ganesan, Abigail B. Wheeler, Nicholas Kerry, Jeremy D. W. Clifton, H. Andrew Schwartz, Ryan L. Boyd:

Capturing Author Self Beliefs in Social Media Language. 1362-1376 - Xiaohao Yang, He Zhao, Weijie Xu, Yuanyuan Qi, Jueqing Lu, Dinh Phung, Lan Du:

Neural Topic Modeling with Large Language Models in the Loop. 1377-1401 - Abhilasha Ravichander, Shrusti Ghela, David Wadden, Yejin Choi:

HALoGEN: Fantastic LLM Hallucinations and Where to Find Them. 1402-1425 - Shuguo Hu, Jun Hu, Huaiwen Zhang:

Synergizing LLMs with Global Label Propagation for Multimodal Fake News Detection. 1426-1440 - Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu:

"Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation. 1441-1465 - Yu Wang, Xiaofei Zhou, Yichen Wang, Geyuan Zhang, Tianxing He:

Jailbreak Large Vision-Language Models Through Multi-Modal Linkage. 1466-1494 - Gracjan Góral, Emilia Wisnios, Piotr Sankowski, Pawel Budzianowski:

Wait, that's not an option: LLMs Robustness with Incorrect Multiple-Choice Options. 1495-1515 - Ameen Ali, Itamar Zimerman, Lior Wolf:

The Hidden Attention of Mamba Models. 1516-1534 - Luohe Shi, Zuchao Li, Lefei Zhang, Baoyuan Qi, Liu Guoming, Hai Zhao:

KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding. 1535-1550 - Yan Wang, Ling Ding, Tien N. Nguyen, Shaohua Wang, Yanan Zheng:

LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models. 1551-1567 - Weiqi Wang, Yangqiu Song:

MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset. 1568-1596 - Hang Li, Tianlong Xu, Kaiqi Yang, Yucheng Chu, Yanling Chen, Yichi Song, Qingsong Wen, Hui Liu:

Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions. 1597-1609 - Sanxing Chen, Yukun Huang, Bhuwan Dhingra:

Real-time Factuality Assessment from Adversarial Feedback. 1610-1630 - Ruohong Zhang, Bowen Zhang, Yanghao Li, Haotian Zhang, Zhiqing Sun, Zhe Gan, Yinfei Yang, Ruoming Pang, Yiming Yang:

Improve Vision Language Model Chain-of-thought Reasoning. 1631-1662 - Haozhe An, Connor Baumler, Abhilasha Sancheti, Rachel Rudinger:

On the Mutual Influence of Gender and Occupation in LLM Representations. 1663-1680 - Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang:

Disentangling Memory and Reasoning Ability in Large Language Models. 1681-1701 - Jiaqi Li, Yanming Li, Xiaoli Shen, Chuanyi Zhang, Guilin Qi, Sheng Bi:

Open-World Attribute Mining for E-Commerce Products with Multimodal Self-Correction Instruction Tuning. 1702-1714 - Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro:

Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attributions Explainability. 1715-1730 - Yuguang Yang, Yu Pan, Jixun Yao, Xiang Zhang, Jianhao Ye, Hongbin Zhou, Lei Xie, Lei Ma, Jianjun Zhao:

Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling. 1731-1742 - Yihong Liu, Haotian Ye, Chunlan Ma, Mingyang Wang, Hinrich Schütze:

LangSAMP: Language-Script Aware Multilingual Pretraining. 1743-1770 - Haoyu Dong, Yue Hu, Huailiang Peng, Yanan Cao:

RelationalCoder: Rethinking Complex Tables via Programmatic Relational Transformation. 1771-1784 - Bolei Ma, Berk Yoztyurk, Anna-Carolina Haensch, Xinpeng Wang, Markus Herklotz, Frauke Kreuter, Barbara Plank, Matthias Aßenmacher:

Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study. 1785-1809 - Fanheng Kong, Jingyuan Zhang, Hongzhi Zhang, Shi Feng, Daling Wang, Linhao Yu, Xingguang Ji, Yu Tian, Victoria W., Fuzheng Zhang:

TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos. 1810-1839 - Zhuo Li, Yuhao Du, Jinpeng Hu, Xiang Wan, Anningzhe Gao:

Self-Instructed Derived Prompt Generation Meets In-Context Learning: Unlocking New Potential of Black-Box LLMs. 1840-1857 - Seungjae Jung, Gunsoo Han, Daniel Wontae Nam, Kyoung-Woon On:

Binary Classifier Optimization for Large Language Model Alignment. 1858-1872 - Md Nayem Uddin, Amir Saeidi, Divij Handa, Agastya Seth, Tran Cao Son, Eduardo Blanco, Steven R. Corman, Chitta Baral:

UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization. 1873-1913 - Yang Zhong, Diane J. Litman:

From Information to Insight: Leveraging LLMs for Open Aspect-Based Educational Summarization. 1914-1947 - Charles Nimo, Tobi Olatunji, Abraham Toluwase Owodunni, Tassallah Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Ezinwanne C. Aka, Folafunmi Omofoye, Foutse Yuehgoh, Timothy Faniran, Bonaventure F. P. Dossou, Moshood O. Yekini, Jonas Kemp, Katherine A. Heller, Jude Chidubem Omeke, Chidi Asuzu MD, Naome A. Etori, Aimérou Ndiaye, Ifeoma Okoh, Evans Doe Ocansey, Wendy Kinara, Michael L. Best, Irfan Essa, Stephen Edward Moore, Chris Fourie, Mercy Nyamewaa Asiedu:

AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset. 1948-1973 - Xinyi Zeng, Yuying Shang, Jiawei Chen, Jingyuan Zhang, Yu Tian:

Root Defense Strategies: Ensuring Safety of LLM at the Decoding Level. 1974-1988 - Tianrui Pan, Jie Liu, Zewen Huang, Jie Tang, Gangshan Wu:

In-the-wild Audio Spatialization with Flexible Text-guided Localization. 1989-2001 - Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim:

L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models. 2002-2024 - Jianqing Zhu, Huang Huang, Zhihang Lin, Juhao Liang, Zhengyang Tang, Khalid Almubarak, Mosen Alharthi, Bang An, Juncai He, Xiangbo Wu, Fei Yu, Junying Chen, Zhuoheng Ma, Yuhao Du, He Zhang, Saied Alshahrani, Emad A. Alghamdi, Lian Zhang, Ruoyu Sun, Haizhou Li, Benyou Wang, Jinchao Xu:

Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion. 2025-2042 - Sangyeop Kim, Yohan Lee, Yongwoo Song, Kimin Lee:

What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs. 2043-2063 - Tao Zhang, Zhenhua Tan:

ECERC: Evidence-Cause Attention Network for Multi-Modal Emotion Recognition in Conversation. 2064-2077 - Li Hu, Guoqiang Chen, Xiuwei Shang, Shaoyin Cheng, Benlong Wu, Li Gangyang, Xu Zhu, Weiming Zhang, Nenghai Yu:

CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System. 2078-2091 - Matthias Orlikowski, Jiaxin Pei, Paul Röttger, Philipp Cimiano, David Jurgens, Dirk Hovy:

Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions. 2092-2111 - Chonghua Liao, Ruobing Xie, Xingwu Sun, Haowen Sun, Zhanhui Kang:

Exploring Forgetting in Large Language Model Pre-Training. 2112-2127 - Virgile Rennard, Christos Xypolopoulos, Michalis Vazirgiannis:

Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks. 2128-2143 - Yifan Xu, Xiao Liu, Xueqiao Sun, Siyi Cheng, Hao Yu, Hanyu Lai, Shudan Zhang, Dan Zhang, Jie Tang, Yuxiao Dong:

AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents. 2144-2166 - Yongxin Huang, Kexin Wang, Goran Glavas, Iryna Gurevych:

Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment. 2167-2187 - Yijie Jin, Junjie Peng, Xuanchao Lin, Haochen Yuan, Lan Wang, Cangzhi Zheng:

Multimodal Transformers are Hierarchical Modal-wise Heterogeneous Graphs. 2188-2209 - Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Shaokai Chen, Mengshu Sun, Binbin Hu, Zhiqiang Zhang, Lei Liang, Wen Zhang, Huajun Chen:

Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking. 2210-2226 - Jan Pfister, Julia Wunderle, Andreas Hotho:

LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch. 2227-2246 - Youngmin Kim, Jiwan Chung, Jisoo Kim, Sunghyun Lee, Sangkyu Lee, Junhyeok Kim, Cheoljong Yang, Youngjae Yu:

Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues. 2247-2265 - Simone Teglia, Simone Tedeschi, Roberto Navigli:

How Much Do Encoder Models Know About Word Senses? 2266-2277 - Huaizhi Ge, Yiming Li, Qifan Wang, Yongfeng Zhang, Ruixiang Tang:

When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations. 2278-2296 - Manuel Tonneau, Diyi Liu, Niyati Malhotra, Scott A. Hale, Samuel Fraiberger, Víctor Orozco-Olvera, Paul Röttger:

HateDay: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter. 2297-2321 - Haitao Li, Junjie Chen, Jingli Yang, Qingyao Ai, Wei Jia, Youfeng Liu, Kai Lin, Yueyue Wu, Guozhi Yuan, Yiran Hu, Wuyue Wang, Yiqun Liu, Minlie Huang:

LegalAgentBench: Evaluating LLM Agents in Legal Domain. 2322-2344 - Peiqi Wang, ShengYun Peng, Xuewen Zhang, Hanchao Yu, Yibo Yang, Lifu Huang, Fujun Liu, Qifan Wang:

Inference Compute-Optimal Video Vision Language Models. 2345-2374 - Anirudh Sundar, Sinead Williamson, Katherine Metcalf, Barry-John Theobald, Skyler Seto, Masha Fedzechkina:

Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models. 2375-2401 - Amrit Poudel, Yifan Ding, Tim Weninger, Jürgen Pfeffer:

Digital Gatekeepers: Google's Role in Curating Hashtags and Subreddits. 2402-2415 - Anna Kolos, Katarzyna Lorenc, Emilia Wisnios, Agnieszka Karlinska:

Behind Closed Words: Creating and Investigating the forePLay Annotated Dataset for Polish Erotic Discourse. 2416-2432 - Maor Reuben, Ortal Slobodin, Idan-Chaim Cohen, Aviad Elyashar, Orna Braun-Lewensohn, Odeya Cohen, Rami Puzis:

Assessment and manipulation of latent constructs in pre-trained language models using psychometric scales. 2433-2444 - Ben Peters, André F. T. Martins:

Did Translation Models Get More Robust Without Anyone Even Noticing? 2445-2458 - Dan Su, Kezhi Kong, Ying Lin, Joseph Jennings, Brandon Norick, Markus Kliegl, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro:

Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset. 2459-2475 - Hans William Alexander Hanley, Zakir Durumeric:

Hierarchical Level-Wise News Article Clustering via Multilingual Matryoshka Embeddings. 2476-2492 - Tassilo Klein, Moin Nabi:

Contrastive Perplexity for Controlled Generation: An Application in Detoxifying Large Language Models. 2493-2508 - Haohang Li, Yupeng Cao, Yangyang Yu, Shashidhar Reddy Javaji, Zhiyang Deng, Yueru He, Yuechen Jiang, Zining Zhu, K. P. Subbalakshmi, Jimin Huang, Lingfei Qian, Xueqing Peng, Jordan W. Suchow, Qianqian Xie:

INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent. 2509-2525 - Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, Griffin Thomas Adams, Jeremy Howard, Iacopo Poli:

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference. 2526-2547 - Zhengyang Shan, Emily Diana, Jiawei Zhou:

Gender Inclusivity Fairness Index (GIFI): A Multilevel Framework for Evaluating Gender Diversity in Large Language Models. 2548-2579 - Qi Zhang, Zhiqing Xiao, Ruixuan Xiao, Lirong Gao, Junbo Zhao:

D.Va: Validate Your Demonstration First Before You Use It. 2580-2594 - Jiwan Chung, Janghan Yoon, Junhyeong Park, Sangeyl Lee, Joowon Yang, Sooyeon Park, Youngjae Yu:

Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists? 2595-2606 - Chia-Yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin-Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, Na Zou:

MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation. 2607-2622 - Hui Liu, Wenya Wang, Hao Sun, Chris Xing Tian, Chenqi Kong, Xin Dong, Haoliang Li:

Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning. 2623-2641 - Yangkun Wang, Zihan Wang, Jingbo Shang:

Direct Prompt Optimization with Continuous Representations. 2642-2652 - Aishik Nagar, Yutong Liu, Andy T. Liu, Viktor Schlegel, Vijay Prakash Dwivedi, Arun-Kumar Kaliya-Perumal, Guna Pratheep Kalanchiam, Yili Tang, Robby T. Tan:

uMedSum: A Unified Framework for Clinical Abstractive Summarization. 2653-2672 - Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen:

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement. 2673-2686 - Fanhang Man, Huandong Wang, Jianjie Fang, Zhaoyi Deng, Baining Zhao, Xinlei Chen, Yong Li:

Context-Aware Sentiment Forecasting via LLM-based Multi-Perspective Role-Playing Agents. 2687-2703 - Xiang Huang, Jiayu Shen, Shanshan Huang, Sitao Cheng, Xiaxia Wang, Yuzhong Qu:

TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data. 2704-2726 - Hanyu Lai, Junjie Gao, Xiao Liu, Yifan Xu, Shudan Zhang, Yuxiao Dong, Jie Tang:

AndroidGen: Building an Android Language Agent under Data Scarcity. 2727-2749 - Mingxuan Xia, Haobo Wang, Yixuan Li, Zewei Yu, Jindong Wang, Junbo Zhao, Runze Wu:

Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation. 2750-2770 - Hanyu Lai, Xiao Liu, Junjie Gao, Jiale Cheng, Zehan Qi, Yifan Xu, Shuntian Yao, Dan Zhang, Jinhua Du, Zhenyu Hou, Xin Lv, Minlie Huang, Yuxiao Dong, Jie Tang:

A Survey of Post-Training Scaling in Large Language Models. 2771-2791 - Tal Haklay, Hadas Orgad, David Bau, Aaron Mueller, Yonatan Belinkov:

Position-aware Automatic Circuit Discovery. 2792-2817 - Yuhuan Lu, Weijian Yu, Xin Jing, Dingqi Yang:

HyperFM: Fact-Centric Multimodal Fusion for Link Prediction over Hyper-Relational Knowledge Graphs. 2818-2830 - Gregor Geigle, Florian Schneider, Carolin Holtermann, Chris Biemann, Radu Timofte, Anne Lauscher, Goran Glavas:

Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model. 2831-2881 - Dimitris Gkoumas, Maria Liakata:

Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation. 2882-2902 - Georg Niess, Roman Kern:

Ensemble Watermarks for Large Language Models. 2903-2916 - Jiahui Geng, Thy Thy Tran, Preslav Nakov, Iryna Gurevych:

Con Instruction: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities. 2917-2933 - Cheng-Han Chiang, Hung-yi Lee, Michal Lukasik:

TRACT: Regression-Aware Fine-tuning Meets Chain-of-Thought Reasoning for LLM-as-a-Judge. 2934-2952 - Hanghui Guo, Jia Zhu, Shimin Di, Weijie Shi, Zhangze Chen, Jiajie Xu:

DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation. 2953-2975 - Boxuan Lyu, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura:

Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation. 2976-2994 - Junjie Ye, Zhengyin Du, Xuesong Yao, Weijian Lin, Yufei Xu, Zehui Chen, Zaiyuan Wang, Sining Zhu, Zhiheng Xi, Siyu Yuan, Tao Gui, Qi Zhang, Xuanjing Huang, Jiecao Chen:

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use. 2995-3021 - Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok:

Mixture of insighTful Experts (MoTE): The Synergy of Reasoning Chains and Expert Mixtures in Self-Alignment. 3022-3038 - Weicong Qin, Yi Xu, Weijie Yu, Chenglei Shen, Ming He, Jianping Fan, Xiao Zhang, Jun Xu:

MAPS: Motivation-Aware Personalized Search via LLM-Driven Consultation Alignment. 3039-3051 - Jundong Xu, Hao Fei, Meng Luo, Qian Liu, Liangming Pan, William Yang Wang, Preslav Nakov, Mong-Li Lee, Wynne Hsu:

Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework. 3052-3075 - Jianghao Chen, Junhong Wu, Yangyifan Xu, Jiajun Zhang:

LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs. 3076-3090 - Yuanfan Li, Zhaohan Zhang, Chengzhengxu Li, Chao Shen, Xiaoming Liu:

Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training. 3091-3113 - Chen Cecilia Liu, Anna Korhonen, Iryna Gurevych:

Cultural Learning-Based Culture Adaptation of Language Models. 3114-3134 - Yuhan Zhou, Naoki Yoshinaga:

A-TASC: Asian TED-Based Automatic Subtitling Corpus. 3135-3148 - Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu:

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training. 3149-3167 - Yuchen Fu, Zifeng Cheng, Zhiwei Jiang, Zhonghui Wang, Yafeng Yin, Zhengliang Li, Qing Gu:

Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs. 3168-3181 - Neha Srikanth, Rachel Rudinger, Jordan Lee Boyd-Graber:

No Questions are Stupid, but some are Poorly Posed: Understanding Poorly-Posed Information-Seeking Questions. 3182-3199 - Rupak Sarkar, Neha Srikanth, Taylor Pellegrin, Rachel Rudinger, Claire Bonial, Philip Resnik:

Understanding Common Ground Misalignment in Goal-Oriented Dialog: A Case-Study with Ubuntu Chat Logs. 3200-3215 - Olga Loginova, Oleksandr Bezrukov, Ravi Shekhar, Alexey Kravets:

Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models. 3216-3246 - Sheng Ouyang, Yulan Hu, Ge Chen, Qingyang Li, Fuzheng Zhang, Yong Liu:

Towards Reward Fairness in RLHF: From a Resource Allocation Perspective. 3247-3259 - Siyuan Li, Juanxi Tian, Zedong Wang, Xin Jin, Zicheng Liu, Wentao Zhang, Dan Xu:

Taming LLMs with Gradient Grouping. 3260-3279 - Sukannya Purkayastha, Zhuang Li, Anne Lauscher, Lizhen Qu, Iryna Gurevych:

LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews. 3280-3308 - Amr Keleg, Sharon Goldwater, Walid Magdy:

Revisiting Common Assumptions about Arabic Dialects in NLP. 3309-3327 - Ravi Patel, Angus Brayne, Rogier Hintzen, Daniel Jaroslawicz, Georgiana Neculae, Dane S. Corneil:

Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification. 3328-3370 - Nishant Balepur, Vishakh Padmakumar, Fumeng Yang, Shi Feng, Rachel Rudinger, Jordan Lee Boyd-Graber:

Whose Boat Does it Float? Improving Personalization in Preference Tuning via Inferred User Personas. 3371-3393 - Nishant Balepur, Rachel Rudinger, Jordan Lee Boyd-Graber:

Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above. 3394-3418 - Muhammad Zain Ali, Yuxia Wang, Bernhard Pfahringer, Tony C. Smith:

Detection of Human and Machine-Authored Fake News in Urdu. 3419-3428 - Yangyang Zhao, Ben Niu, Libo Qin, Shihan Wang:

An Efficient Task-Oriented Dialogue Policy: Evolutionary Reinforcement Learning Injected by Elite Individuals. 3429-3442 - Jiahuan Zhang, Tianheng Wang, Ziyi Huang, Yulong Wu, Hanqing Wu, Dongbai Chen, Linfeng Song, Yue Zhang, Guozheng Rao, Kaicheng Yu:

SR-LLM: Rethinking the Structured Representation in Large Language Model. 3443-3462 - Chuang Zhou, Zhu Wang, Shengyuan Chen, Jiahe Du, Qiyuan Zheng, Zhaozhuo Xu, Xiao Huang:

Taming Language Models for Text-attributed Graph Learning with Decoupled Aggregation. 3463-3474 - Zifeng Cheng, Zhonghui Wang, Yuchen Fu, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu:

Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering. 3475-3487 - Jinghan He, Kuan Zhu, Haiyun Guo, Junfeng Fang, Zhenglin Hua, Yuheng Jia, Ming Tang, Tat-Seng Chua, Jinqiao Wang:

Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence. 3488-3501 - Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou:

Hierarchical Document Refinement for Long-context Retrieval-augmented Generation. 3502-3520 - Chaoyi Xiang, Chunhua Liu, Simon De Deyne, Lea Frermann:

Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations. 3521-3536 - Yuting Wei, Qi Meng, Yuanxing Xu, Bin Wu:

TEACH: A Contrastive Knowledge Adaptive Distillation Framework for Classical Chinese Understanding. 3537-3550 - Guanting Dong, Jiajie Jin, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen:

RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation. 3551-3578 - Guanting Dong, Chenghao Zhang, Mengjie Deng, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen:

Progressive Multimodal Reasoning via Active Retrieval. 3579-3602 - Hao Peng, Xin Lv, Yushi Bai, Zijun Yao, Jiajie Zhang, Lei Hou, Juanzi Li:

Pre-training Distillation for Large Language Models: A Design Space Exploration. 3603-3618 - Pu Jian, Donglei Yu, Wen Yang, Shuo Ren, Jiajun Zhang:

Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions. 3619-3638 - Yushi Bai, Shangqing Tu, Jiajie Zhang, Hao Peng, Xiaozhi Wang, Xin Lv, Shulin Cao, Jiazheng Xu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li:

LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks. 3639-3664 - Haiyang Wang, Zhiliang Tian, Yuchen Pan, Xin Song, Xin Niu, Minlie Huang, Bin Zhou:

Battling against Tough Resister: Strategy Planning with Adversarial Game for Non-collaborative Dialogues. 3665-3685 - Youcheng Huang, Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv:

Cross-model Transferability among Large Language Models on the Platonic Representations of Concepts. 3686-3704 - Guichao Zhu, Lintian Lei, Yuhao Qing, Yichao Fu, Fanxin Li, Dong Huang, Zekai Sun, Heming Cui:

FoldMoE: Efficient Long Sequence MoE Training via Attention-MoE Pipelining. 3705-3717 - Jiajie Zhang, Zhongni Hou, Xin Lv, Shulin Cao, Zhenyu Hou, Yilin Niu, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li:

LongReward: Improving Long-context Large Language Models with AI Feedback. 3718-3739 - Yuxi Xia, Pedro Henrique Luz de Araujo, Klim Zaporojets, Benjamin Roth:

Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles. 3740-3761 - Boxi Yu, Yuxuan Zhu, Pinjia He, Daniel Kang:

UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench. 3762-3774 - Lekang Jiang, Pascal A. Scherz, Stefan Goetz:

Towards Better Evaluation for Generated Patent Claims. 3775-3788 - Haritz Puerto, Tilek Chubakov, Xiaodan Zhu, Harish Tayyar Madabushi, Iryna Gurevych:

Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs. 3789-3808 - Kejian Zhu, Shangqing Tu, Zhuoran Jin, Lei Hou, Juanzi Li, Jun Zhao:

Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis. 3809-3822 - Yanzhu Guo, Simone Conia, Zelin Zhou, Min Li, Saloni Potdar, Henry Xiao:

Do Large Language Models have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs. 3823-3838 - Zhu Xu, Zhiqiang Zhao, Zihan Zhang, Yuchi Liu, Quanwei Shen, Fei Liu, Yu Kuang, Jian He, Conglin Liu:

Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning. 3839-3853 - Xiaochen Zhu, Caiqi Zhang, Tom Stafford, Nigel Collier, Andreas Vlachos:

Conformity in Large Language Models. 3854-3872 - Chenghao Sun, Zhen Huang, Yonggang Zhang, Le Lu, Houqiang Li, Xinmei Tian, Xu Shen, Jieping Ye:

Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings. 3873-3895 - Lukas Kinder, Lukas Edman, Alexander Fraser, Tobias Käfer:

Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding. 3896-3908 - Weilin Zhao, Tengyu Pan, Xu Han, Yudi Zhang, Sun Ao, Yuxiang Huang, Kaihuo Zhang, Weilun Zhao, Yuxuan Li, Jie Zhou, Hao Zhou, Jianyong Wang, Maosong Sun, Zhiyuan Liu:

FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling. 3909-3921 - Congzhi Zhang, Jiawei Peng, Zhenglin Wang, Yilong Lai, Haowen Sun, Heng Chang, Fei Ma, Weijiang Yu:

VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism. 3922-3941 - Nianqi Li, Siyu Yuan, Jiangjie Chen, Jiaqing Liang, Feng Wei, Zujie Liang, Deqing Yang, Yanghua Xiao:

Past Meets Present: Creating Historical Analogy with Large Language Models. 3942-3957 - Yaoke Wang, Yun Zhu, Xintong Bao, Wenqiao Zhang, Suyang Dai, Kehan Chen, Wenqiang Li, Gang Huang, Siliang Tang, Yueting Zhuang:

Meta-Reflection: A Feedback-Free Reflection Learning Framework. 3958-3976 - Chen Zhang, Jiuheng Lin, Xiao Liu, Zekai Zhang, Yansong Feng:

Read it in Two Steps: Translating Extremely Low-Resource Languages with Code-Augmented Grammar Books. 3977-3997 - Zhe Yang, Yichang Zhang, Yudong Wang, Ziyao Xu, Junyang Lin, Zhifang Sui:

Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs. 3998-4014 - Kangcheng Luo, Quzhe Huang, Cong Jiang, Yansong Feng:

Automating Legal Interpretation with LLMs: Retrieval, Generation, and Evaluation. 4015-4047 - Wei Li, Zhen Huang, Houqiang Li, Le Lu, Yang Lu, Xinmei Tian, Xu Shen, Jieping Ye:

Visual Evidence Prompting Mitigates Hallucinations in Large Vision-Language Models. 4048-4080 - Shao Zhang, Xihuai Wang, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen:

Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration. 4081-4108 - Chong Li, Jiajun Zhang, Chengqing Zong:

TokAlign: Efficient Vocabulary Adaptation via Token Alignment. 4109-4126 - Qi Li, Xiaowen Chu:

AdaEdit: Advancing Continuous Knowledge Editing For Large Language Models. 4127-4149 - Byung-Doh Oh, William Schuler:

The Impact of Token Granularity on the Predictive Power of Language Model Surprisal. 4150-4162 - Xiaochen Zhu, Georgi Karadzhov, Chenxi Whitehouse, Andreas Vlachos:

Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models. 4163-4183 - Taolin Zhang, Dongyang Li, Qizhou Chen, Chengyu Wang, Xiaofeng He:

BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering. 4184-4202 - Zhangyue Yin, Qiushi Sun, Zhiyuan Zeng, Qinyuan Cheng, Xipeng Qiu, Xuanjing Huang:

Dynamic and Generalizable Process Reward Modeling. 4203-4233 - Zixin Chen, Hongzhan Lin, Kaixin Li, Ziyang Luo, Zhen Ye, Guang Chen, Zhiyong Huang, Jing Ma:

AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness. 4234-4253 - Xin Zhang, Ziqi Dai, Yongqi Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Jun Yu, Wenjie Li, Min Zhang:

Towards Text-Image Interleaved Retrieval. 4254-4269 - Guangcheng Zhu, Ruixuan Xiao, Haobo Wang, Zhen Zhu, Gengyu Lyu, Junbo Zhao:

Large Margin Representation Learning for Robust Cross-lingual Named Entity Recognition. 4270-4291 - Wei Sun, Qianlong Du, Fuwei Cui, Jiajun Zhang:

An Efficient and Precise Training Data Construction Framework for Process-supervised Reward Model in Mathematical Reasoning. 4292-4305 - Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang:

QAEncoder: Towards Aligned Representation Learning in Question Answering Systems. 4306-4332 - Jiale Hong, Hongqiu Wu, Hai Zhao:

Game Development as Human-LLM Interaction. 4333-4354 - Rena Wei Gao, Xuetong Wu, Tatsuki Kuribayashi, Mingrui Ye, Siya Qi, Carsten Roever, Yuanxing Liu, Zheng Yuan, Jey Han Lau:

Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases. 4355-4379 - Zhuoqun Li, Haiyang Yu, Xuanang Chen, Hongyu Lin, Yaojie Lu, Fei Huang, Xianpei Han, Yongbin Li, Le Sun:

DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking. 4380-4396 - Viet Thanh Pham, Lizhen Qu, Zhuang Li, Suraj Sharma, Gholamreza Haffari:

SurveyPilot: an Agentic Framework for Automated Human Opinion Collection from Social Media. 4397-4422 - Daoze Zhang, Yuze Zhao, Jintao Huang, Yingda Chen:

Sharper and Faster mean Better: Towards More Efficient Vision-Language Model for Hour-scale Long Video Understanding. 4423-4439 - Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Weiwen Xu, Deli Zhao, Lidong Bing:

Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions. 4440-4463 - Andrea Pedrotti, Giulia Rambelli, Caterina Villani, Marianna Bolognesi:

How Humans and LLMs Organize Conceptual Knowledge: Exploring Subordinate Categories in Italian. 4464-4482 - Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Min Zhang:

PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models. 4483-4502 - Bowen Wei, Ziwei Zhu:

ProtoLens: Advancing Prototype Learning for Fine-Grained Interpretability in Text Classification. 4503-4523 - Chaoqun Cui, Liangbin Huang, Shijing Wang, Zhe Tong, Zhaolong Huang, Xiao Zeng, Xiaofeng Liu:

Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization. 4524-4546 - Chunlei Xin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Xuanang Chen, Xinyan Guan, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun:

Sparse Latents Steer Retrieval-Augmented Generation. 4547-4562 - Boyi Deng, Yu Wan, Baosong Yang, Yidan Zhang, Fuli Feng:

Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders. 4563-4608 - Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Jason Zhaoxin Fan, Bo Tang, Jihao Zhao, Jiawei Yang, Shichao Song, Mengwei Wang:

SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model. 4609-4631 - Guo Tang, Zheng Chu, Wenxiang Zheng, Junjia Xiang, Yizhuo Li, Weihao Zhang, Ming Liu, Bing Qin:

AnRe: Analogical Replay for Temporal Knowledge Graph Forecasting. 4632-4650 - Zhiyuan Zeng, Qinyuan Cheng, Zhangyue Yin, Yunhua Zhou, Xipeng Qiu:

Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? 4651-4665 - Zitai Qiu, Congbo Ma, Jia Wu, Jian Yang:

Text is All You Need: LLM-enhanced Incremental Social Event Detection. 4666-4680 - Tong Liu, Zhixin Lai, Jiawen Wang, Gengyuan Zhang, Shuo Chen, Philip Torr, Vera Demberg, Volker Tresp, Jindong Gu:

Multimodal Pragmatic Jailbreak on Text-to-image Models. 4681-4720 - Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang:

Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks. 4721-4747 - Wei Liu, Michael Strube:

Discourse Relation-Enhanced Neural Coherence Modeling. 4748-4762 - Kuofeng Gao, Shutao Xia, Ke Xu, Philip Torr, Jindong Gu:

Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models. 4763-4784 - Yu Yan, Sheng Sun, Zenghao Duan, Teli Liu, Min Liu, Zhiyi Yin, Jingyu Lei, Qi Li:

from Benign import Toxic: Jailbreaking the Language Model via Adversarial Metaphors. 4785-4817 - Hengyuan Zhang, Chenming Shang, Sizhe Wang, Dongdong Zhang, Yiyao Yu, Feng Yao, Renliang Sun, Yujiu Yang, Furu Wei:

ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Multilingual Contrastive Framework. 4818-4841 - Zongqi Wang, Tianle Gu, Baoyuan Wu, Yujiu Yang:

MorphMark: Flexible Adaptive Watermarking for Large Language Models. 4842-4860 - Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou:

A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression. 4861-4879 - Cassie Huang, Li Zhang:

On the Limit of Language Models as Planning Formalizers. 4880-4904 - Yaxi Lu, Haolun Li, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Zhiyuan Liu, Fangming Liu, Maosong Sun:

Learning to Generate Structured Output with Schema Reinforcement Learning. 4905-4918 - Peichao Lai, Zhengfeng Zhang, Wentao Zhang, Fangcheng Fu, Bin Cui:

Enhancing Unsupervised Sentence Embeddings via Knowledge-Driven Data Augmentation and Gaussian-Decayed Contrastive Learning. 4919-4940 - Peijian Gu, Quan Wang, Zhendong Mao:

Improve Safety Training of Large Language Models with Safety-Critical Singular Vectors Localization. 4941-4954 - Huawen Feng, Pu Zhao, Qingfeng Sun, Can Xu, Fangkai Yang, Lu Wang, Qianli Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:

WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models. 4955-4969 - Junqing Gong, Binhan Yang, Wei Shen:

A Triple-View Framework for Fine-Grained Emotion Classification with Clustering-Guided Contrastive Learning. 4970-4984 - Sunbowen Lee, Junting Zhou, Chang Ao, Kaige Li, Xeron Du, Sirui He, Haihong Wu, Tianci Liu, Jiaheng Liu, Hamid Alinejad-Rokny, Min Yang, Yitao Liang, Zhoufutu Wen, Shiwen Ni:

Quantification of Large Language Model Distillation. 4985-5004 - Zihan Qiu, Zeyu Huang, Bo Zheng, Kaiyue Wen, Zekun Wang, Rui Men, Ivan Titov, Dayiheng Liu, Jingren Zhou, Junyang Lin:

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models. 5005-5018 - Jinyang Wu, Shuai Zhang, Feihu Che, Mingkuan Feng, Pengpeng Shao, Jianhua Tao:

Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models. 5019-5039 - Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu:

Stepwise Reasoning Disruption Attack of LLMs. 5040-5058 - Qiyuan Zhang, Yufei Wang, Yuxin Jiang, Liangyou Li, Chuhan Wu, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma:

Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge. 5059-5074 - Mingyang Wang, Heike Adel, Lukas Lange, Yihong Liu, Ercong Nie, Jannik Strötgen, Hinrich Schütze:

Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models. 5075-5094 - Yining Lu, Noah Ziems, Hy Dang, Meng Jiang:

Optimizing Decomposition for Optimal Claim Verification. 5095-5114 - Kai Yao, Zhaorui Tan, Penglei Gao, Lichun Li, Kaixin Wu, Yinggui Wang, Yuan Zhao, Yixin Ji, Jianke Zhu, Wei Wang:

GradOT: Training-free Gradient-preserving Offsite-tuning for Large Language Models. 5115-5130 - Moxin Li, Yong Zhao, Wenxuan Zhang, Shuaiyi Li, Wenya Xie, See-Kiong Ng, Tat-Seng Chua, Yang Deng:

Knowledge Boundary of Large Language Models: A Survey. 5131-5157 - Hai-Long Sun, Zhun Sun, Houwen Peng, Han-Jia Ye:

Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning. 5158-5171 - Jihao Zhao, Zhiyuan Ji, Zhaoxin Fan, Hanyu Wang, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li:

MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System. 5172-5189 - Hyeong Kyu Choi, Weijie Xu, Chi Xue, Stephanie Eckman, Chandan K. Reddy:

Mitigating Selection Bias with Node Pruning and Auxiliary Options. 5190-5215 - Luhao Zhang, Xinyu Zhang, Linmei Hu, Dandan Song, Liqiang Nie:

Dually Self-Improved Counterfactual Data Augmentation Using Large Language Model. 5216-5227 - Shi-Qi Yan, Quan Liu, Zhen-Hua Ling:

RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation. 5228-5240 - Yanyang Li, Michael R. Lyu, Liwei Wang:

Learning to Reason from Feedback at Test-Time. 5241-5253 - Zecheng Tang, Keyan Zhou, Juntao Li, Baibei Ji, Jianye Hou, Min Zhang:

L-CiteEval: A Suite for Evaluating Fidelity of Long-context Models. 5254-5277 - Trisha Das, Afrah Shafquat, Mandis Beigi, Jacob Aptekar, Jimeng Sun:

SECRET: Semi-supervised Clinical Trial Document Similarity Search. 5278-5291 - Jin Hwa Lee, Thomas Jiralerspong, Lei Yu, Yoshua Bengio, Emily Cheng:

Geometric Signatures of Compositionality Across a Language Model's Lifetime. 5292-5320 - Maxime Griot, Jean Vanderdonckt, Demet Yüksel, Coralie Hemptinne:

Pattern Recognition or Medical Knowledge? The Problem with Multiple-Choice Questions in Medicine. 5321-5341 - Jenna Russell, Marzena Karpinska, Mohit Iyyer:

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text. 5342-5373 - Yiwen Hu, Huatong Song, Jie Chen, Jia Deng, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Yang Lu, Xu Miao, Xin Zhao, Ji-Rong Wen:

YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model. 5374-5400 - Timothee Mickus, Aman Sinha, Raúl Vázquez:

Your Model is Overconfident, and Other Lies We Tell Ourselves. 5401-5417 - Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch:

Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention. 5418-5433 - Kyeonghyun Kim, Jinhee Jang, Juhwan Choi, Yoonji Lee, Kyohoon Jin, YoungBin Kim:

Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models. 5434-5452 - Han Meng, Yancan Chen, Yunan Li, Yitian Yang, Jungup Lee, Renwen Zhang, Yi-Chieh Lee:

What is Stigma Attributed to? A Theory-Grounded, Expert-Annotated Interview Corpus for Demystifying Mental-Health Stigma. 5453-5490 - Yuguo Yin, Yuxin Xie, Wenyuan Yang, Dongchao Yang, Jinghan Ru, Xianwei Zhuang, Liming Liang, Yuexian Zou:

ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors. 5491-5504 - Tianshi Zheng, Jiazheng Wang, Zihao Wang, Jiaxin Bai, Hang Yin, Zheye Deng, Yangqiu Song, Jianxin Li:

Enhancing Transformers for Generalizable First-Order Logical Entailment. 5505-5524 - Yufan Zhuang, Xiaodong Yu, Jialian Wu, Ximeng Sun, Ze Wang, Jiang Liu, Yusheng Su, Jingbo Shang, Zicheng Liu, Emad Barsoum:

Self-Taught Agentic Long Context Understanding. 5525-5537 - Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi:

Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training. 5538-5554 - Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu:

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis. 5555-5579 - Yepeng Weng, Dianwen Mei, Huishi Qiu, Xujie Chen, Li Liu, Jiang Tian, Zhongchao Shi:

CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter. 5580-5593 - Antonin Poché, Alon Jacovi, Agustin Martin Picard, Victor Boutin, Fanny Jourdan:

ConSim: Measuring Concept-Based Explanations' Effectiveness with Automated Simulatability. 5594-5615 - Omer Shubi, Cfir Avraham Hadar, Yevgeni Berzak:

Decoding Reading Goals from Eye Movements. 5616-5637 - Si Wu, Sebastian Bruch:

Uncovering Visual-Semantic Psycholinguistic Properties from the Distributional Structure of Text Embedding Space. 5638-5649 - Bin Xie, Rui Shao, Gongwei Chen, Kaiwen Zhou, Yinchuan Li, Jie Liu, Min Zhang, Liqiang Nie:

GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent. 5650-5667 - Xiaodong Chen, Yuxuan Hu, Xiaokang Zhang, Yanling Wang, Cuiping Li, Hong Chen, Jing Zhang:

P² Law: Scaling Law for Post-Training After Model Pruning. 5668-5686 - Kuleen Sasse, Carlos Alejandro Aguirre, Isabel Cachola, Sharon Levy, Mark Dredze:

Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats. 5687-5709 - Shihan Dou, Jiayi Chen, Chenhao Huang, Feng Chen, Wei Chengzhi, Huiyuan Zheng, Shichun Liu, Yan Liu, Chenxiao Liu, Chao Xin, Lin Yan, Zongzhang Zhang, Tao Gui, Qi Zhang, Xuanjing Huang:

Lost in the Context: Insufficient and Distracted Attention to Contexts in Preference Modeling. 5710-5728 - Jinu Lee, Qi Liu, Runzhi Ma, Vincent Han, Ziqi Wang, Heng Ji, Julia Hockenmaier:

Entailment-Preserving First-order Logic Representations in Natural Language Entailment. 5729-5742 - Duzhen Zhang, Yong Ren, Zhong-Zhi Li, Yahan Yu, Jiahua Dong, Chenxing Li, Zhilong Ji, Jinfeng Bai:

Enhancing Multimodal Continual Instruction Tuning with BranchLoRA. 5743-5756 - Yoav Gur-Arieh, Roy Mayan, Chen Agassy, Atticus Geiger, Mor Geva:

Enhancing Automated Interpretability with Output-Centric Feature Descriptions. 5757-5778 - Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen:

Towards Effective and Efficient Continual Pre-training of Large Language Models. 5779-5795 - Yihao Huang, Chong Wang, Xiaojun Jia, Qing Guo, Felix Juefei-Xu, Jian Zhang, Yang Liu, Geguang Pu:

Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization. 5796-5816 - Anwen Hu, Haiyang Xu, Liang Zhang, Jiabo Ye, Ming Yan, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou:

mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding. 5817-5834 - Do Xuan Long, Duy Dinh, Ngoc-Hai Nguyen, Kenji Kawaguchi, Nancy F. Chen, Shafiq Joty, Min-Yen Kan:

What Makes a Good Natural Language Prompt? 5835-5873 - Weiqi Wu, Hongqiu Wu, Hai Zhao:

X-TURING: Towards an Enhanced and Efficient Turing Test for Long-Term Dialogue Agents. 5874-5889 - Shivani Kumar, David Jurgens:

Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral. 5890-5912 - Zheyuan Liu, Guangyao Dou, Xiangchi Yuan, Chunhui Zhang, Zhaoxuan Tan, Meng Jiang:

Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models. 5913-5933 - Zheyuan Zhang, Yiyang Li, Nhi Ha Lan Le, Zehong Wang, Tianyi Ma, Vincent Galassi, Keerthiram Murugesan, Nuno Moniz, Werner Geyer, Nitesh V. Chawla, Chuxu Zhang, Yanfang Ye:

NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning. 5934-5966 - Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu Zhang:

ReLearn: Unlearning via Learning for Large Language Models. 5967-5987 - Pritom Saha Akash, Kevin Chen-Chuan Chang:

Understanding Cross-Domain Adaptation in Low-Resource Topic Modeling. 5988-6001 - Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong:

UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models. 6002-6024 - Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang:

CoT-Valve: Length-Compressible Chain-of-Thought Tuning. 6025-6035 - Jie Ouyang, Tingyue Pan, Mingyue Cheng, Ruiran Yan, Yucong Luo, Jiaying Lin, Qi Liu:

HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation. 6036-6063 - Qiwei Zhao, Dong Li, Yanchi Liu, Wei Cheng, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Huaxiu Yao, Chen Zhao, Haifeng Chen, Xujiang Zhao:

Uncertainty Propagation on LLM Agent. 6064-6073 - Valeria Ruscio, Umberto Nanni, Fabrizio Silvestri:

Beyond Position: the emergence of wavelet-like properties in Transformers. 6074-6088 - Giovanni Servedio, Alessandro De Bellis, Dario Di Palma, Vito Walter Anelli, Tommaso Di Noia:

Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs. 6089-6104 - Zheyuan Liu, Suraj Maharjan, Fanyou Wu, Rahil Parikh, Belhassen Bayar, Srinivasan H. Sengamedu, Meng Jiang:

Disentangling Biased Knowledge from Reasoning in Large Language Models via Machine Unlearning. 6105-6123 - Dario Di Palma, Alessandro De Bellis, Giovanni Servedio, Vito Walter Anelli, Fedelucio Narducci, Tommaso Di Noia:

LLaMAs Have Feelings Too: Unveiling Sentiment and Emotion Representations in LLaMA Models Through Probing. 6124-6142 - Yayu Cao, Tianxiang Wang, Lvxiaowei Xu, Zhenyao Wang, Ming Cai:

CxGGEC: Construction-Guided Grammatical Error Correction. 6143-6156 - Xiangyu Zhang, Yu Zhou, Guang Yang, Wei Cheng, Taolue Chen:

Beyond Sequences: Two-dimensional Representation and Dependency Encoding for Code Generation. 6157-6172 - Qing Li, Jiahui Geng, Zongxiong Chen, Derui Zhu, Yuxia Wang, Congbo Ma, Chenyang Lyu, Fakhri Karray:

HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs. 6173-6186 - Dongqi Liu, Chenxi Whitehouse, Xi Yu, Louis Mahon, Rohit Saxena, Zheng Zhao, Yifu Qiu, Mirella Lapata, Vera Demberg:

What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations. 6187-6210 - Ruisheng Cao, Hanchong Zhang, Tiancheng Huang, Zhangyi Kang, Yuxin Zhang, Liangtai Sun, Hanqi Li, Yuxun Miao, Shuai Fan, Lu Chen, Kai Yu:

NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering. 6211-6239 - Xiuxuan Shen, Zhongyuan Jiang, Junsan Zhang, Junxiao Han, Yao Wan, Chengjie Guo, Bingcheng Liu, Jie Wu, Renxiang Li, Philip S. Yu:

ProvBench: A Benchmark of Legal Provision Recommendation for Contract Auto-Reviewing. 6240-6254 - Yushen Chen, Zhikang Niu, Ziyang Ma, Keqi Deng, Chunhui Wang, Jian Zhao, Kai Yu, Xie Chen:

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching. 6255-6271 - Xiechi Zhang, Zetian Ouyang, Linlin Wang, Gerard de Melo, Zhu Cao, Xiaoling Wang, Ya Zhang, Yanfeng Wang, Liang He:

AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation. 6272-6285 - Bohan Zhang, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang:

CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis. 6286-6303 - Xuandong Zhao, Chenwen Liao, Yuxiang Wang, Lei Li:

Efficiently Identifying Watermarked Segments in Mixed-Source Texts. 6304-6316 - Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Xun Wang, Si-Qing Chen, Michael J. Wooldridge, Janet B. Pierrehumbert, Furu Wei:

Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks. 6317-6342 - Qing Wang, Yuepei Li, Qiao Qiao, Kang Zhou, Qi Li:

Towards a More Generalized Approach in Open Relation Extraction. 6343-6354 - Viktor Moskvoretskii, Maria Marina, Mikhail Salnikov, Nikolay Ivanov, Sergey Pletenev, Daria Galimzianova, Nikita Krayko, Vasily Konovalov, Irina Nikishina, Alexander Panchenko:

Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home. 6355-6384 - Seungone Kim, Juyoung Suk, Xiang Yue, Vijay Viswanathan, Seongyun Lee, Yizhong Wang, Kiril Gashteovski, Carolin Lawrence, Sean Welleck, Graham Neubig:

Evaluating Language Models as Synthetic Data Generators. 6385-6403 - Yuyao Ge, Shenghua Liu, Baolong Bi, Yiwei Wang, Lingrui Mei, Wenjie Feng, Lizhe Chen, Xueqi Cheng:

Can Graph Descriptive Order Affect Solving Graph Problems with LLMs? 6404-6420 - Wei Hao, Ran Li, Weiliang Zhao, Junfeng Yang, Chengzhi Mao:

Learning to Rewrite: Generalized LLM-Generated Text Detection. 6421-6434 - Linhao Yu, Xingguang Ji, Yahui Liu, Fanheng Kong, Chenxi Sun, Jingyuan Zhang, Hongzhi Zhang, Victoria W., Fuzheng Zhang, Deyi Xiong:

Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search. 6435-6462 - Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Mariya Krylova, Egor Venediktov, Aleksandr Zuev, Evgeny Burnaev:

GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs. 6463-6480 - Hong Huang, Dapeng Wu:

Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis. 6481-6496 - Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Qing Yu, Go Irie, Yixuan Li, Hai Helen Li, Ziwei Liu, Kiyoharu Aizawa:

Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models. 6497-6540 - Yuhang Wu, Wenmeng Yu, Yean Cheng, Yan Wang, Xiaohan Zhang, Jiazheng Xu, Ming Ding, Yuxiao Dong:

AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models. 6541-6558 - Jillian Fisher, Shangbin Feng, Robert Aron, Thomas Richardson, Yejin Choi, Daniel W. Fisher, Jennifer Pan, Yulia Tsvetkov, Katharina Reinecke:

Biased LLMs can Influence Political Decision-Making. 6559-6607 - T. Y. S. S. Santosh, Tuan-Quang Vuong:

LexTempus: Enhancing Temporal Generalizability of Legal Language Models Through Dynamic Mixture of Experts. 6608-6624 - Soda Marem Lo, Oscar Araque, Rajesh Sharma, Marco Antonio Stranisci:

That is Unacceptable: the Moral Foundations of Canceling. 6625-6639 - Jun Yin, Pengyu Zeng, Haoyuan Sun, Yuqin Dai, Han Zheng, Miao Zhang, Yachao Zhang, Shuai Lu:

FloorPlan-LLaMa: Aligning Architects' Feedback and Domain Knowledge in Architectural Floor Plan Generation. 6640-6662 - Max Ku, Cheuk Hei Chong, Jonathan Leung, Krish Shah, Alvin Yu, Wenhu Chen:

TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding. 6663-6684 - Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Chaoqun Liu, Lidong Bing, Deli Zhao, Anh Tuan Luu, Yu Rong:

FineReason: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving. 6685-6715 - Sergey Berezin, Reza Farahbakhsh, Noël Crespi:

The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs. 6716-6730 - Léane Jourdan, Nicolas Hernandez, Florian Boudin, Richard Dufour:

Identifying Reliable Evaluation Metrics for Scientific Text Revision. 6731-6756 - Liwei Jiang, Taylor Sorensen, Sydney Levine, Yejin Choi:

Can Language Models Reason about Individualistic Human Values and Preferences? 6757-6794 - Dmitry Morozov, Lizaveta Astapenka, Anna V. Glazkova, Timur Garipov, Olga Lyashevskaya:

BERT-like Models for Slavic Morpheme Segmentation. 6795-6815 - Xianzhen Luo, Yixuan Wang, Qingfu Zhu, Zhiming Zhang, Xuanyu Zhang, Qing Yang, Dongliang Xu:

Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling. 6816-6831 - Xinyu Tang, Xiaolei Wang, Zhihao Lv, Yingqian Min, Xin Zhao, Binbin Hu, Ziqi Liu, Zhiqiang Zhang:

Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering. 6832-6849 - Jiazheng Li, Hanqi Yan, Yulan He:

Drift: Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference. 6850-6866 - Angelina Wang, Michelle Phan, Daniel E. Ho, Sanmi Koyejo:

Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs. 6867-6893 - Shojiro Yamabe, Futa Kai Waseda, Tsubasa Takahashi, Koki Wataoka:

MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models. 6894-6916 - Zeyao Ma, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang:

Dynamic Scaling of Unit Tests for Code Reward Modeling. 6917-6935 - Fengran Mo, Yifan Gao, Chuan Meng, Xin Liu, Zhuofeng Wu, Kelong Mao, Zhengyang Wang, Pei Chen, Zheng Li, Xian Li, Bing Yin, Meng Jiang:

UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations. 6936-6949 - Minghao Lv, Siyuan Chen, Haoan Jin, Minghao Yuan, Qianqian Ju, Yujia Peng, Kenny Q. Zhu, Mengyue Wu:

Tracking Life's Ups and Downs: Mining Life Events from Social Media Posts for Mental Health Analysis. 6950-6965 - Shengpeng Ji, Qian Chen, Wen Wang, Jialong Zuo, Minghui Fang, Ziyue Jiang, Hai Huang, Zehan Wang, Xize Cheng, Siqi Zheng, Zhou Zhao:

ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control. 6966-6981 - Haoran Que, Wenge Rong:

PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression. 6982-6995 - Dasha Metropolitansky, Jonathan Larson:

Towards Effective Extraction and Evaluation of Factual Claims. 6996-7045 - Yijie Hao, Haofei Yu, Jiaxuan You:

Beyond Facts: Evaluating Intent Hallucination in Large Language Models. 7046-7069 - Yida Zhao, Hao Xve, Xiang Hu, Kewei Tu:

A Systematic Study of Compositional Syntactic Transformer Language Models. 7070-7083 - Zhaopeng Feng, Jiayuan Su, Jiamei Zheng, Jiahan Ren, Yan Zhang, Jian Wu, Hongwei Wang, Zuozhu Liu:

M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation. 7084-7107 - Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Junhao Huang, Conghui He, Dahua Lin, Jiaqi Wang:

SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition. 7108-7127 - Jinghao Zhang, Yuting Liu, Wenjie Wang, Qiang Liu, Shu Wu, Liang Wang, Tat-Seng Chua:

Personalized Text Generation with Contrastive Activation Steering. 7128-7141 - Siyuan Huang, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Jingwen Leng, Minyi Guo, Zhouhan Lin:

Gumbel Reranking: Differentiable End-to-End Reranker Optimization. 7142-7161 - Lester James Validad Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi:

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback. 7162-7200 - Yi-Fan Lu, Xian-Ling Mao, Tian Lan, Tong Zhang, Yu-Shi Zhu, Heyan Huang:

SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection. 7201-7218 - Angelina Aspra Aquino, Lester James Validad Miranda, Elsie Marie T. Or:

The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project. 7219-7239 - Jennifer Chen, Aidar Myrzakhan, Yaxin Luo, Hassaan Muhammad Khan, Sondos Mahmoud Bsharat, Zhiqiang Shen:

DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation. 7240-7260 - Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang:

G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems. 7261-7276 - Bumjin Park, Leejinsil Leejinsil, Jaesik Choi:

Deontological Keyword Bias: The Impact of Modal Expressions on Normative Judgments of Language Models. 7277-7296 - Weijie Shi, Han Zhu, Jiaming Ji, Mengze Li, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Sirui Han, Yike Guo:

LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning. 7297-7313 - Maggie Mi, Aline Villavicencio, Nafise Sadat Moosavi:

Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context. 7314-7332 - Xuanle Zhao, Xianzhen Luo, Qi Shi, Chi Chen, Shuo Wang, Zhiyuan Liu, Maosong Sun:

ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation. 7333-7348 - Nina Gregorio, Matteo Gay, Sharon Goldwater, Edoardo M. Ponti:

The Cross-linguistic Role of Animacy in Grammar Structures. 7349-7363 - Ayush Maheshwari, Atul Kumar Singh, N. J. Karthika, Krishnakant Bhatt, Preethi Jyothi, Ganesh Ramakrishnan:

LexGen: Domain-aware Multilingual Lexicon Generation. 7364-7375 - Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen:

How to Train Long-Context Language Models (Effectively). 7376-7399 - Qizhi Pei, Lijun Wu, Zhuoshi Pan, Yu Li, Honglin Lin, Chenlin Ming, Xin Gao, Conghui He, Rui Yan:

MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion. 7400-7420 - Ramon Ruiz-Dolz, Zlata Kikteva, John Lawrence:

Mining Complex Patterns of Argumentative Reasoning in Natural Language Dialogue. 7421-7435 - Xueyu Hu, Tao Xiong, Biao Yi, Zishu Wei, Ruixuan Xiao, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shenzhi Wang, Xinchen Xu, Shuofei Qiao, Zhaokai Wang, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang, Fei Wu:

OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use. 7436-7465 - Mingfei Lau, Qian Chen, Yeming Fang, Tingting Xu, Tongzhou Chen, Pavel Golik:

Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning. 7466-7492 - Amr Mohamed, Mingmeng Geng, Michalis Vazirgiannis, Guokan Shang:

LLM as a Broken Telephone: Iterative Generation Distorts Information. 7493-7509 - Jianshu Zhang, Dongyu Yao, Renjie Pi, Paul Pu Liang, Yi R. Fung:

VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues. 7510-7545 - Xiang Geng, Zhejian Lai, Jiajun Chen, Hao Yang, Shujian Huang:

Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation. 7546-7560 - Fan Zhang, Shulin Tian, Ziqi Huang, Yu Qiao, Ziwei Liu:

Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models. 7561-7582 - Zongxia Li, Lorena Calvo-Bartolomé, Alexander Miserlis Hoyle, Paiheng Xu, Daniel Kofi Stephens, Juan Francisco Fung, Alden Dima, Jordan Lee Boyd-Graber:

Large Language Models Struggle to Describe the Haystack without Human Help: A Social Science-Inspired Evaluation of Topic Models. 7583-7604 - Ziyue Wang, Chi Chen, Fuwen Luo, Yurui Dong, Yuanchi Zhang, Yuzhuang Xu, Xiaolong Wang, Peng Li, Yang Liu:

ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models. 7605-7633 - Ritwik Gupta, Rodolfo Corona, Jiaxin Ge, Eric Wang, Dan Klein, Trevor Darrell, David M. Chan:

Enough Coin Flips Can Make LLMs Act Bayesian. 7634-7655 - Wenye Lin, Jonathan Roberts, Yunhan Yang, Samuel Albanie, Zongqing Lu, Kai Han:

GAMEBoT: Transparent Assessment of LLM Reasoning in Games. 7656-7682 - Zhijie Nie, Richong Zhang, Zhanyu Wu:

A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens. 7683-7694 - Abdelrahman Boda Sadallah, Junior Cedric Tonga, Khalid Almubarak, Saeed Almheiri, Farah Atif, Chatrine Qwaider, Karima Kadaoui, Sara Shatnawi, Yaser Alesh, Fajri Koto:

Commonsense Reasoning in Arab Culture. 7695-7710 - Junting Lu, Zhiyang Zhang, Fangkai Yang, Jue Zhang, Lu Wang, Chao Du, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:

AXIS: Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents. 7711-7743 - Yang Chen, Vedaant Shah, Alan Ritter:

Translation and Fusion Improves Cross-lingual Information Extraction. 7744-7764 - Shaobo Cui, Wenqing Liu, Yiyang Feng, Jiawei Zhou, Boi Faltings:

Conditional Dichotomy Quantification via Geometric Embedding. 7765-7791 - Zhaoxuan Tan, Zheng Li, Tianyi Liu, Haodong Wang, Hyokun Yun, Ming Zeng, Pei Chen, Zhihan Zhang, Yifan Gao, Ruijie Wang, Priyanka Nigam, Bing Yin, Meng Jiang:

Aligning Large Language Models with Implicit Preferences from User-Generated Content. 7792-7820 - Yuyan Chen, Jiyuan Jia, Jiaxin Lu, Siyue Li, Yu Guan, Ming Yang, Qingpei Guo:

VQAGuider: Guiding Multimodal Large Language Models to Answer Complex Video Questions. 7821-7834 - Fang Wu, Vijay Prakash Dwivedi, Jure Leskovec:

Large Language Models are Good Relational Learners. 7835-7854 - Michael Ogezi, Freda Shi:

SpaRE: Enhancing Spatial Reasoning in Vision-Language Models with Synthetic Data. 7855-7875 - William Barr Held, Yanzhe Zhang, Weiyan Shi, Minzhi Li, Michael J. Ryan, Diyi Yang:

Distilling an End-to-End Voice Assistant Without Instruction Training Data. 7876-7891 - Shuhang Xu, Fangwei Zhong:

CoMet: Metaphor-Driven Covert Communication for Multi-Agent Language Games. 7892-7917 - Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah:

CER: Confidence Enhanced Reasoning in LLMs. 7918-7938 - Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, Michael Chau:

Watermarking Large Language Models: An Unbiased and Low-risk Method. 7939-7960 - Haoyang Wen, Jiang Guo, Yi Zhang, Jiarong Jiang, Zhiguo Wang:

On Synthetic Data Strategies for Domain-Specific Generative Retrieval. 7961-7976 - Ying Shen, Lifu Huang:

LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates. 7977-7992 - Tamer Alkhouli, Katerina Margatina, James Gung, Raphael Shu, Claudia Zaghi, Monica Sunkara, Yi Zhang:

CONFETTI: Conversational Function-Calling Evaluation Through Turn-Level Interactions. 7993-8006 - Anthony B. Sicilia, Malihe Alikhani:

Evaluating Theory of (an uncertain) Mind: Predicting the Uncertain Beliefs of Others from Conversational Cues. 8007-8021 - Shaobo Cui, Luca Mouchel, Boi Faltings:

Uncertainty in Causality: A New Frontier. 8022-8044 - Michael J. Ryan, Omar Shaikh, Aditri Bhagirath, Daniel Frees, William Barr Held, Diyi Yang:

SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs. 8045-8078 - Julia Mendelsohn, Ceren Budak:

When People are Floods: Analyzing Dehumanizing Metaphors in Immigration Discourse with Large Language Models. 8079-8103 - Weidi Luo, Shenghong Dai, Xiaogeng Liu, Suman Banerjee, Huan Sun, Muhao Chen, Chaowei Xiao:

AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection. 8104-8139 - Yiqing Xie, Wenxuan Zhou, Pradyot Prakash, Di Jin, Yuning Mao, Quintin Fettes, Arya Talebzadeh, Sinong Wang, Han Fang, Carolyn P. Rosé, Daniel Fried, Hejia Zhang:

Improving Model Factuality with Fine-grained Critique-based Evaluator. 8140-8155 - Florencia Marotta-Wurgler, David Stein:

Building a Long Text Privacy Policy Corpus with Multi-Class Labels. 8156-8219 - Leonardo Ranaldi, Federico Ranaldi, Giulia Pucci:

R2-MultiOmnia: Leading Multilingual Multimodal Reasoning via Self-Training. 8220-8234 - Samuel Joseph Amouyal, Aya Meltzer-Asscher, Jonathan Berant:

When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models. 8235-8253 - Zixiang Xu, Yanbo Wang, Yue Huang, Xiuying Chen, Jieyu Zhao, Meng Jiang, Xiangliang Zhang:

Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models. 8254-8284 - Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao:

VLSBench: Unveiling Visual Leakage in Multimodal Safety. 8285-8316 - Sky CH-Wang, Darshan Girish Deshpande, Smaranda Muresan, Anand Kannappan, Rebecca Qian:

Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning. 8317-8331 - Jonibek Mansurov, Akhmed Sakip, Alham Fikri Aji:

Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation. 8332-8345 - Francesco Corso, Francesco Pierri, Gianmarco De Francisci Morales:

Conspiracy Theories and Where to Find Them on TikTok. 8346-8362 - Chunhui Zhang, Sirui Wang, Zhongyu Ouyang, Xiangchi Yuan, Soroush Vosoughi:

Growing Through Experience: Scaling Episodic Grounding in Language Models. 8363-8375 - Yuan Zhou, Zhuo Zhang, Xiangyu Zhang:

Exploiting the Shadows: Unveiling Privacy Leaks through Lower-Ranked Tokens in Large Language Models. 8376-8386 - Yanzhe Zhang, Tao Yu, Diyi Yang:

Attacking Vision-Language Computer Agents via Pop-ups. 8387-8401 - Congbo Ma, Yuxia Wang, Jia Wu, Jian Yang, Jing Du, Zitai Qiu, Qing Li, Hu Wang, Preslav Nakov:

Explicit and Implicit Data Augmentation for Social Event Detection. 8402-8415 - Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Rajan Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister:

In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents. 8416-8439 - Xiaoyi Bao, Zhongqing Wang, Jinghang Gu, Chu-Ren Huang:

Revisiting Classical Chinese Event Extraction with Ancient Literature Information. 8440-8451 - Xiangyu Peng, Prafulla Kumar Choubey, Caiming Xiong, Chien-Sheng Wu:

Unanswerability Evaluation for Retrieval Augmented Generation. 8452-8472 - Chengshuai Zhao, Zhen Tan, Chau-Wai Wong, Xinyan Zhao, Tianlong Chen, Huan Liu:

SCALE: Towards Collaborative Content Analysis in Social Science with Large Language Model Agents and Human Intervention. 8473-8503 - Erxin Yu, Jing Li, Ming Liao, Qi Zhu, Boyang Xue, Minghui Xu, Baojun Wang, Lanqing Hong, Fei Mi, Lifeng Shang:

Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning. 8504-8519 - Kunlun Zhu, Yifan Luo, Dingling Xu, Yukun Yan, Zhenghao Liu, Shi Yu, Ruobing Wang, Shuo Wang, Yishan Li, Nan Zhang, Xu Han, Zhiyuan Liu, Maosong Sun:

RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework. 8520-8544 - Homaira Huda Shomee, Zhu Wang, Sathya N. Ravi, Sourav Medya:

A Survey on Patent Analysis: From NLP to Multimodal AI. 8545-8561 - Chengye Wang, Yifei Shen, Zexi Kuang, Arman Cohan, Yilun Zhao:

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification. 8562-8579 - Kunlun Zhu, Hongyi Du, Zhaochen Hong, Xiaocheng Yang, Shuyi Guo, Zhe Wang, Zhenhailong Wang, Cheng Qian, Robert Tang, Heng Ji, Jiaxuan You:

MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents. 8580-8622 - Tharindu Ranasinghe, Hansi Hettiarachchi, Nadeesha Chathurangi Naradde Vidana Pathirana, Damith Premasiri, Lasitha Uyangodage, Isuri Anuradha Nanomi Arachchige, Alistair Plum, Paul Rayson, Ruslan Mitkov:

Sinhala Encoder-only Language Models and Evaluation. 8623-8636 - Zhengxiang Wang, Veronika Makarova, Zhi Li, Jordan Kodner, Owen Rambow:

LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing. 8637-8663 - Haomin Zhuang, Yihua Zhang, Kehan Guo, Jinghan Jia, Gaowen Liu, Sijia Liu, Xiangliang Zhang:

SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs? 8664-8678 - Bolei Ma, Yuting Li, Wei Zhou, Ziwei Gong, Yang Janet Liu, Katja Jasinskaja, Annemarie Friedrich, Julia Hirschberg, Frauke Kreuter, Barbara Plank:

Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges. 8679-8696 - Zhaoling Chen, Robert Tang, Gangda Deng, Fang Wu, Jialong Wu, Zhiwei Jiang, Viktor K. Prasanna, Arman Cohan, Xingyao Wang:

LocAgent: Graph-Guided LLM Agents for Code Localization. 8697-8727 - Raghvendra Kumar, Mohammed Salman S. A, Aryan Sahu, Tridib Nandi, Pragathi Y. P., Sriparna Saha, José G. Moreno:

COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation. 8728-8748 - Minzhi Li, William Barr Held, Michael J. Ryan, Kunat Pipatanakul, Potsawee Manakul, Hao Zhu, Diyi Yang:

Mind the Gap: Static and Interactive Evaluations of Large Audio Models. 8749-8766 - Renhao Pei, Yihong Liu, Peiqin Lin, François Yvon, Hinrich Schütze:

Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu. 8767-8788 - Jizhan Fang, Tianhe Lu, Yunzhi Yao, Ziyan Jiang, Xin Xu, Huajun Chen, Ningyu Zhang:

CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs. 8789-8807 - Cheng Xu, Nan Yan:

TripleFact: Defending Data Contamination in the Evaluation of LLM-driven Fake News Detection. 8808-8823 - Xiaomeng Zhu, Zhenghao Zhou, Simon Charlow, Robert Frank:

Meaning Beyond Truth Conditions: Evaluating Discourse Level Understanding via Anaphora Accessibility. 8824-8842 - Irtaza Khalid, Amir Masoud Nourollah, Steven Schockaert:

Large Language and Reasoning Models are Shallow Disjunctive Reasoners. 8843-8869 - Senyu Li, Zipeng Sun, Jiayi Wang, Xue Liu, Pontus Stenetorp, Siva Reddy, David Ifeoluwa Adelani:

Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation. 8870-8880 - Nedjma Ousidhoum, Meriem Beloucif, Saif M. Mohammad:

Building Better: Avoiding Pitfalls in Developing Language Resources when Data is Scarce. 8881-8894 - Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, Jan Philip Wahle, Terry Ruas, Meriem Beloucif, Christine de Kock, Nirmal Surange, Daniela Teodorescu, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Alham Fikri Aji, Felermino D. M. A. Ali, Ilseyar Alimova, Vladimir Araujo, Nikolay Babakov, Naomi Baes, Ana-Maria Bucur, Andiswa Bukula, Guanqun Cao, Rodrigo Tufino Cardenas, Rendi Chevi, Chiamaka Ijeoma Chukwuneke, Alexandra Ciobotaru, Daryna Dementieva, Murja Sani Gadanya, Robert Geislinger, Bela Gipp, Oumaima Hourrane, Oana Ignat, Falalu Ibrahim Lawan, Rooweither Mabuya, Rahmad Mahendra, Vukosi Marivate, Alexander Panchenko, Andrew Piper, Charles Henrique Porto Ferreira, Vitaly Protasov, Samuel Rutunda, Manish Shrivastava, Aura Cristina Udrea, Lilian Diana Awuor Wanzare, Sophie Wu, Florian Valentin Wunderlich, Hanif Muhammad Zhafran, Tianhui Zhang, Yi Zhou, Saif M. Mohammad:

BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages. 8895-8916 - Yufei Tian, Jiao Sun, Nanyun Peng, Zizhao Zhang:

SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation. 8917-8933 - Yanlin Feng, Simone Papicchio, Sajjadur Rahman:

CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era. 8934-8958 - Francine Chen, Scott A. Carter, Tatiana Lau, Nayeli Suseth Bravo, Sumanta Bhattacharyya, Kate A. Sieck, Charlene C. Wu:

Empathy Prediction from Diverse Perspectives. 8959-8974 - Federico Ravenda, Seyed Ali Bahrainian, Andrea Raballo, Antonietta Mira, Noriko Kando:

Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice. 8975-8991 - Aum Kendapadi, Kerem Zaman, Rakesh R. Menon, Shashank Srivastava:

INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models. 8992-9024 - Alan Sun:

Circuit Stability Characterizes Language Model Generalization. 9025-9040 - Olga Zamaraeva, Dan Flickinger, Francis Bond, Carlos Gómez-Rodríguez:

Comparing LLM-generated and human-authored news text using formal syntactic theory. 9041-9060 - Sharan Maiya, Yinhong Liu, Ramit Debnath, Anna Korhonen:

Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes. 9061-9081 - Yixin Wan, Kai-Wei Chang:

White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs. 9082-9108 - Adriana Eufrosina Bora, Akshatha Arodi, Duoyi Zhang, Jordan Bannister, Mirko Bronzi, Arsène Fansi Tchango, Md. Abul Bashar, Richi Nayak, Kerrie L. Mengersen:

AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions. 9109-9135 - Mohsen Fayyaz, Ali Modarressi, Hinrich Schütze, Nanyun Peng:

Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence. 9136-9152 - Zhining Liu, Rana Ali Amjad, Ravinarayana Adkathimar, Tianxin Wei, Hanghang Tong:

SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence. 9153-9173 - Yixin Wan, Kai-Wei Chang:

The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects. 9174-9190 - Michalis Korakakis, Andreas Vlachos, Adrian Weller:

Mitigating Shortcut Learning with InterpoLated Learning. 9191-9206 - Theron S. Wang, Xingyuan Li, Hridayesh Lekhak, Tuan Minh Dang, Mengyue Wu, Kenny Q. Zhu:

Toward Automatic Discovery of a Canine Phonetic Alphabet. 9207-9219 - Haotian Zhou, Tingkai Liu, Qianli Ma, Yufeng Zhang, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang:

DavIR: Data Selection via Implicit Reward for Large Language Models. 9220-9237 - Artidoro Pagnoni, Ramakanth Pasunuru, Pedro Rodríguez, John Nguyen, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason E. Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srini Iyer:

Byte Latent Transformer: Patches Scale Better Than Tokens. 9238-9258 - Zhenhao Li, Huichi Zhou, Marek Rei, Lucia Specia:

DiffuseDef: Improved Robustness to Adversarial Attacks via Iterative Denoising. 9259-9274 - Huanhuan Wei, Xiao Luo, Hongyi Yu, Jinping Liang, Luning Yang, Lixing Lin, Alexandra Popa, Xiting Yan:

Identifying Cellular Niches in Spatial Transcriptomics: An Investigation into the Capabilities of Large Language Models. 9275-9289 - Zahra Bokaei, Walid Magdy, Bonnie Webber:

Culture Matters in Toxic Language Detection in Persian. 9290-9304 - Jinheng Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei:

Bitnet.cpp: Efficient Edge Inference for Ternary LLMs. 9305-9322 - Guilherme Fonseca, Washington Cunha, Gabriel Prenassi, Marcos André Gonçalves, Leonardo Chaves Dutra da Rocha:

Instance-Selection-Inspired Undersampling Strategies for Bias Reduction in Small and Large Language Models for Binary Text Classification. 9323-9340 - Yeachan Kim, SangKeun Lee:

Forward Knows Efficient Backward Path: Saliency-Guided Memory-Efficient Fine-tuning of Large Language Models. 9341-9356 - Aofei Chang, Le Huang, Alex James Boyd, Parminder Bhatia, Taha A. Kass-Hout, Cao Xiao, Fenglong Ma:

Focus on What Matters: Enhancing Medical Vision-Language Models with Automatic Attention Alignment Tuning. 9357-9372 - Jiongnan Liu, Yutao Zhu, Shuting Wang, Xiaochi Wei, Erxue Min, Yu Lu, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou:

LLMs + Persona-Plug = Personalized LLMs. 9373-9385 - Masato Mita, Ryo Yoshida, Yohei Oseki:

Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition. 9386-9399 - Tao Feng, Lizhen Qu, Niket Tandon, Gholamreza Haffari:

IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery in the Absence of Tabular Data. 9400-9428 - Hao Yu, Jesujoba Oluwadara Alabi, Andiswa Bukula, Jian Yun Zhuang, En-Shiun Annie Lee, Tadesse Kebede Guge, Israel Abebe Azime, Happy Buzaaba, Blessing Kudzaishe Sibanda, Godson Koffi Kalipe, Jonathan Mukiibi, Salomon Kabongo Kabenamualu, Mmasibidi Setaka, Lolwethu Ndolela, Nkiruka Odu, Rooweither Mabuya, Shamsuddeen Hassan Muhammad, Salomey Osei, Sokhar Samb, Dietrich Klakow, David Ifeoluwa Adelani:

INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages. 9429-9452 - Hongjin Qian, Zheng Liu, Peitian Zhang, Zhicheng Dou, Defu Lian:

Boosting Long-Context Information Seeking via Query-Guided Activation Refilling. 9453-9464 - Tianyi Bai, Ling Yang, Zhen Hao Wong, Fupeng Sun, Xinlin Zhuang, Jiahui Peng, Chi Zhang, Lijun Wu, Jiantao Qiu, Wentao Zhang, Binhang Yuan, Conghui He:

Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration. 9465-9491 - Han Liu, Changya Li, Xiaotong Zhang, Feng Zhang, Fenglong Ma, Wei Wang, Hong Yu:

AdaDHP: Fine-Grained Fine-Tuning via Dual Hadamard Product and Adaptive Parameter Selection. 9492-9504 - Jinhao Jiang, Kun Zhou, Xin Zhao, Yang Song, Chen Zhu, Hengshu Zhu, Ji-Rong Wen:

KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph. 9505-9523 - Mingyu Lee, Yeachan Kim, Wing-Lam Mok, SangKeun Lee:

Curriculum Debiasing: Toward Robust Parameter-Efficient Fine-Tuning Against Dataset Biases. 9524-9540 - Austin Xu, Srijan Bansal, Yifei Ming, Semih Yavuz, Shafiq Joty:

Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings. 9541-9564 - Tao Feng, Lizhen Qu, Niket Tandon, Zhuang Li, Xiaoxi Kang, Gholamreza Haffari:

On the Reliability of Large Language Models for Causal Discovery. 9565-9590 - Jingxuan Li, Yuning Yang, Shengqi Yang, Linfan Zhang, Ying Nian Wu:

Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media Contexts. 9591-9610 - Ziyang Liu, Chaokun Wang:

TeRDy: Temporal Relation Dynamics through Frequency Decomposition for Temporal Knowledge Graph Completion. 9611-9622 - Yerim Oh, Jun-Hyung Park, Junho Kim, SungHo Kim, SangKeun Lee:

Incorporating Domain Knowledge into Materials Tokenization. 9623-9644 - Yidan Wang, Yanan Cao, Yubing Ren, Fang Fang, Zheng Lin, Binxing Fang:

PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization. 9645-9660 - Rana Muhammad Shahroz, Zhen Tan, Sukwon Yun, Charles Fleming, Tianlong Chen:

Agents Under Siege: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks. 9661-9674 - Shusheng Li, Jiale Li, Yifei Qu, Xinwei Shi, Yanliang Guo, Ziyi He, Yubo Wang, Wenjun Tan:

Semantic-Eval: A Semantic Comprehension Evaluation Framework for Large Language Models Generation without Training. 9675-9690 - Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen:

Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. 9691-9709 - Hyuhng Joon Kim, Youna Kim, Sang-goo Lee, Taeuk Kim:

When to Speak, When to Abstain: Contrastive Decoding with Abstention. 9710-9730 - Herun Wan, Minnan Luo, Zhixiong Su, Guang Dai, Xiang Zhao:

On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs. 9731-9761 - Lei Wang, Zheqing Zhang, Xu Chen:

Investigating and Extending Homans' Social Exchange Theory with Large Language Model based Agents. 9762-9777 - Jiesong Liu, Brian Park, Xipeng Shen:

A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models. 9778-9794 - Ryo Yoshida, Shinnosuke Isono, Kohei Kajikawa, Taiga Someya, Yushi Sugimoto, Yohei Oseki:

If Attention Serves as a Cognitive Model of Human Memory Retrieval, What is the Plausible Memory Representation? 9795-9812 - Yongqi Li, Shen Zhou, Xiaohu Li, Xin Miao, Jintao Wen, Mayi Xu, Jianhao Chen, Birong Pan, Hankun Kang, Yuanyuan Zhu, Ming Zhong, Tieyun Qian:

Aligning VLM Assistants with Personalized Situated Cognition. 9813-9839 - Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, Hongming Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu:

Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models. 9840-9855 - Huanran Zheng, Xiaoling Wang:

Faster Speculative Decoding via Effective Draft Decoder with Pruned Candidate Tree. 9856-9868 - Zhuojun Ding, Wei Wei, Chenghao Fan:

Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models. 9869-9886 - Tao Wu, Jingyuan Chen, Wang Lin, Mengze Li, Yumeng Zhu, Ang Li, Kun Kuang, Fei Wu:

Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents. 9887-9908 - Jiali Chen, Xusen Hei, Hongfei Liu, Yuancheng Wei, Zikun Deng, Jiayuan Xie, Yi Cai, Qing Li:

CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction. 9909-9927 - Junyi Li, Hwee Tou Ng:

Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling. 9928-9942 - Dana R. Alsagheer, Abdulrahman Kamal, Mohammad Kamal, Cosmo Yang Wu, Weidong Shi:

The Lawyer That Never Thinks: Consistency and Fairness as Keys to Reliable AI. 9943-9954 - SungHo Kim, Nayeon Kim, Taehee Jeon, SangKeun Lee:

Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean. 9955-9984 - Wen Huang, Yanmei Gu, Zhiming Wang, Huijia Zhu, Yanmin Qian:

SpeechFake: A Large-Scale Multilingual Speech Deepfake Dataset Incorporating Cutting-Edge Generation Methods. 9985-9998 - Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li:

ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation. 9999-10020 - Huisheng Wang, Zhuoshi Pan, Hangjing Zhang, Mingxiao Liu, Hanqing Gao, H. Vicky Zhao:

InvestAlign: Overcoming Data Scarcity in Aligning Large Language Models with Investor Decision-Making Processes Under Herd Behavior. 10021-10052 - Abudurexiti Reheman, Hongyu Liu, Junhao Ruan, Abudukeyumu Abudula, Yingfeng Luo, Tong Xiao, JingBo Zhu:

Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation. 10053-10065 - Fuwei Zhang, Xiaoyu Liu, Xinyu Jia, Yingfei Zhang, Shuai Zhang, Xiang Li, Fuzhen Zhuang, Wei Lin, Zhao Zhang:

Multi-level Relevance Document Identifier Learning for Generative Retrieval. 10066-10080 - Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Ping Luo:

EfficientQAT: Efficient Quantization-Aware Training for Large Language Models. 10081-10100 - Siting Li, Pang Wei Koh, Simon Shaolei Du:

Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder. 10101-10119 - Hyuntak Kim, Byung-Hak Kim:

NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization. 10120-10157 - Xiao Wang, Jingyun Hua, Weihong Lin, Yuanxing Zhang, Fuzheng Zhang, Jianlong Wu, Di Zhang, Liqiang Nie:

HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models. 10158-10181 - Yanhao Jia, Xinyi Wu, Li Hao, Qinglin Zhang, Yuxiao Hu, Shuai Zhao, Wenqi Fan:

Uni-Retrieval: A Multi-Style Retrieval Framework for STEM's Education. 10182-10197 - Lin Mu, Xiaoyu Wang, Li Ni, Yang Li, Zhize Wu, Peiquan Jin, Yiwen Zhang:

DenseLoRA: Dense Low-Rank Adaptation of Large Language Models. 10198-10211 - Jisoo Mok, Ik-hwan Kim, Sangkwon Park, Sungroh Yoon:

Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis. 10212-10239 - Yuheng Chen, Pengfei Cao, Yubo Chen, Yining Wang, Shengping Liu, Kang Liu, Jun Zhao:

Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models. 10240-10261 - Shenglai Zeng, Pengfei He, Kai Guo, Tianqi Zheng, Hanqing Lu, Yue Xing, Hui Liu:

Towards Context-Robust LLMs: A Gated Representation Fine-tuning Approach. 10262-10276 - Yuqian Li, Yupei Du, Yufang Liu, Feifei Feng, Mou Xiao Feng, Yuanbin Wu:

On Support Samples of Next Word Prediction. 10277-10289 - Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zhou, Pengjun Xie, Fei Huang:

WebWalker: Benchmarking LLMs in Web Traversal. 10290-10305 - Yidan Wang, Yubing Ren, Yanan Cao, Binxing Fang:

From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models. 10306-10322 - Hongxin Li, Jingfan Chen, Jingran Su, Yuntao Chen, Qing Li, Zhaoxiang Zhang:

AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs. 10323-10358 - Jingwen Sun, Zhiyi Tian, Yu He, Jingwei Sun, Guangzhong Sun:

Introducing Graph Context into Language Models through Parameter-Efficient Fine-Tuning for Lexical Relation Mining. 10359-10374 - Zhirui Zeng, Jiamou Liu, Meng-Fen Chiang, Jialing He, Zijian Zhang:

S-RAG: A Novel Audit Framework for Detecting Unauthorized Use of Personal Data in RAG Systems. 10375-10385 - Yongqi Leng, Renren Jin, Yue Chen, Zhuowen Han, Ling Shi, Jianxiang Peng, Lei Yang, Juesi Xiao, Deyi Xiong:

Praetor: A Fine-Grained Generative LLM Evaluator with Instance-Level Customizable Evaluation Criteria. 10386-10418 - Zhecheng Sheng, Xiruo Ding, Brian Hur, Changye Li, Trevor Cohen, Serguei V. S. Pakhomov:

Mitigating Confounding in Speech-Based Dementia Detection through Weight Masking. 10419-10434 - Yang Liu, Jiahuan Cao, Hiuyi Cheng, Yongxin Shi, Kai Ding, Lianwen Jin:

MCS-Bench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in Chinese Classical Studies. 10435-10492 - Yuheng Chen, Pengfei Cao, Kang Liu, Jun Zhao:

The Knowledge Microscope: Features as Better Analytical Lenses than Neurons. 10493-10515 - Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao:

From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding. 10516-10543 - Haoran Li, Wenbin Hu, Huihao Jing, Yulin Chen, Qi Hu, Sirui Han, Tianshu Chu, Peizhao Hu, Yangqiu Song:

PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance. 10544-10559 - Yanran Wu, Inez Hua, Yi Ding:

Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View. 10560-10576 - Jinglong Gao, Xiao Ding, Lingxiao Zou, Bibo Cai, Bing Qin, Ting Liu:

ExpeTrans: LLMs Are Experiential Transfer Learners. 10577-10616 - Cong Liu, Xiaojun Quan, Yan Pan, Weigang Wu, Xu Chen, Liang Lin:

Cool-Fusion: Fuse Large Language Models without Training. 10617-10627 - Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li:

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation. 10628-10666 - Hui Huang, Jiaheng Liu, Yancheng He, Shilong Li, Bing Xu, Conghui Zhu, Muyun Yang, Tiejun Zhao:

MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training. 10667-10686 - Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Xin Zhao, Bingning Wang, Weipeng Chen:

LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation. 10687-10707 - Yuxiang Huang, Mingye Li, Xu Han, Chaojun Xiao, Weilin Zhao, Sun Ao, Hao Zhou, Jie Zhou, Zhiyuan Liu, Maosong Sun:

APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs. 10708-10727 - Yiyang Zhang, Nan Chen:

PPT: A Minor Language News Recommendation Model via Cross-Lingual Preference Pattern Transfer. 10728-10745 - Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, Bing Qin:

GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis. 10746-10757 - Chenxia Tang, Jianchun Liu, Hongli Xu, Liusheng Huang:

Top-nσ: Eliminating Noise in Logit Space for Robust Token Sampling of LLM. 10758-10774 - Jialong Wu, Zhenglin Wang, Linhai Zhang, Yilong Lai, Yulan He, Deyu Zhou:

SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation. 10775-10790 - Thanh Duc Pham, Nam Le Hai, Linh Ngo Van, Nguyen Thi Ngoc Diep, Sang Dinh, Thien Huu Nguyen:

Mitigating Non-Representative Prototypes and Representation Bias in Few-Shot Continual Relation Extraction. 10791-10809 - Wei Tao, Haocheng Lu, Xiaoyang Qu, Bin Zhang, Kai Lu, Jiguang Wan, Jianzong Wang:

MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts. 10810-10820 - Ziqian Zeng, Jianwei Wang, Junyao Yang, Zhengdong Lu, Haoran Li, Huiping Zhuang, Cen Chen:

PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration. 10821-10855 - Xinlin Zhuang, Jiahui Peng, Ren Ma, Yinfan Wang, Tianyi Bai, Xingjian Wei, Jiantao Qiu, Chi Zhang, Ying Qian, Conghui He:

Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models. 10856-10896 - Qingchen Yu, Zifan Zheng, Ding Chen, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li:

GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning. 10897-10912 - Kehua Feng, Keyan Ding, Hongzhi Tan, Kede Ma, Zhihua Wang, Shuangquan Guo, Yuzhou Cheng, Ge Sun, Guozhou Zheng, Qiang Zhang, Huajun Chen:

Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition. 10913-10947 - Guanran Luo, Zhongquan Jian, Wentao Qiu, Meihong Wang, Qingqiang Wu:

DTCRS: Dynamic Tree Construction for Recursive Summarization. 10948-10963 - Zhiyu Zhang, Wei Chen, Youfang Lin, Huaiyu Wan:

A Generative Adaptive Replay Continual Learning Model for Temporal Knowledge Graph Reasoning. 10964-10977 - Yize Zhang, Tianshu Wang, Sirui Chen, Kun Wang, Xingyu Zeng, Hongyu Lin, Xianpei Han, Le Sun, Chaochao Lu:

ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search. 10978-10995 - Ziyan Wang, Zhankun Xiong, Feng Huang, Wen Zhang:

PKAG-DDI: Pairwise Knowledge-Augmented Language Model for Drug-Drug Interaction Event Text Generation. 10996-11010 - Shuai Niu, Jing Ma, Hongzhan Lin, Liang Bai, Zhihua Wang, Richard Yi Da Xu, Yunya Song, Xian Yang:

Knowledge-Augmented Multimodal Clinical Rationale Generation for Disease Diagnosis with Small Language Models. 11011-11024 - Xindi Li, Zhe Liu, Tong Zhang, Jiahao Chen, Qingming Li, Jinbao Li, Shouling Ji:

TWIST: Text-encoder Weight-editing for Inserting Secret Trojans in Text-to-Image Models. 11025-11041 - Abhijnan Nath, Carine Graff, Andrei Bachinin, Nikhil Krishnaswamy:

Frictional Agent Alignment Framework: Slow Down and Don't Break Things. 11042-11089 - Dongjin Park, Eunsang Lee, Joon-Woo Lee:

Powerformer: Efficient and High-Accuracy Privacy-Preserving Language Model with Homomorphic Encryption. 11090-11111 - Weixiang Zhao, Yulin Hu, Yang Deng, Jiahe Guo, Xingyu Sui, Xinyang Han, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu:

Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs. 11112-11137 - Zihao Li, Lecheng Zheng, Bowen Jin, Dongqi Fu, Baoyu Jing, Yikun Ban, Jingrui He, Jiawei Han:

Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision? 11138-11165 - Hongqiu Wu, Weiqi Wu, Tianyang Xu, Jiameng Zhang, Hai Zhao:

Towards Enhanced Immersion and Agency for LLM-based Interactive Drama. 11166-11182 - Shun Inadumi, Nobuhiro Ueda, Koichiro Yoshino:

Disambiguating Reference in Visually Grounded Dialogues through Joint Modeling of Textual and Multimodal Semantic Structures. 11183-11198 - Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Yi Sun, Luke Zettlemoyer, Gargi Ghosh, Wen-tau Yih:

Improving Factuality with Explicit Working Memory. 11199-11213 - Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He:

Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models. 11214-11232 - Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang, Ziwei Liu, Bo Du, Xianglong Liu, Dacheng Tao:

Dynamic Parallel Tree Search for Efficient LLM Reasoning. 11233-11252 - Junyi Chen, Shihao Bai, Zaijun Wang, Siyu Wu, Chuheng Du, Hailong Yang, Ruihao Gong, Shengzhong Liu, Fan Wu, Guihai Chen:

Pre³: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation. 11253-11267 - Ge Qu, Jinyang Li, Bowen Qin, Xiaolong Li, Nan Huo, Chenhao Ma, Reynold Cheng:

SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL. 11268-11292 - Tao Zhang, Ziqian Zeng, YuxiangXiao YuxiangXiao, Huiping Zhuang, Cen Chen, James R. Foulds, Shimei Pan:

GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models. 11293-11311 - Peng Zhou, Pengsen Ma, Jianmin Wang, Xibao Cai, Haitao Huang, Wei Liu, Longyue Wang, Lai Hou Tim, Xiangxiang Zeng:

Large Language and Protein Assistant for Protein-Protein Interactions Prediction. 11312-11327 - Jiaan Wang, Fandong Meng, Zengkui Sun, Yunlong Liang, Yuxuan Cao, Jiarong Xu, Haoxiang Shi, Jie Zhou:

An Empirical Study of Many-to-Many Summarization with Large Language Models. 11328-11344 - Suhang Wu, Jialong Tang, Chengyi Yang, Pei Zhang, Baosong Yang, Junhui Li, Junfeng Yao, Min Zhang, Jinsong Su:

Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models. 11345-11360 - Lingxiao Diao, Xinyue Xu, Wanxuan Sun, Cheng Yang, Zhuosheng Zhang:

GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents. 11361-11399 - Xinke Jiang, Yue Fang, Rihong Qiu, Haoyu Zhang, Yongxin Xu, Hao Chen, Wentao Zhang, Ruizhe Zhang, Yuchen Fang, Xinyu Ma, Xu Chu, Junfeng Zhao, Yasha Wang:

TC-RAG: Turing-Complete RAG's Case study on Medical LLM Systems. 11400-11426 - Zexiong Ma, Chao Peng, Pengfei Gao, Xiangxin Meng, Yanzhen Zou, Bing Xie:

SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning. 11427-11441 - Zhongzhan Huang, Guoming Ling, Shanshan Zhong, Hefeng Wu, Liang Lin:

MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models. 11442-11460 - Xin Sun, Jianan Xie, Zhongqi Chen, Qiang Liu, Shu Wu, Yuehe Chen, Bowen Song, Zilei Wang, Weiqiang Wang, Liang Wang:

Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG. 11461-11480 - Wanzong Peng, Lin Ye, Xuetao Du, Hongli Zhang, Dongyang Zhan, Yunting Zhang, Yicheng Guo, Chen Zhang:

PwnGPT: Automatic Exploit Generation Based on Large Language Models. 11481-11494 - Cuc Thi Bui, Nguyen Truong Son, Trang Van Truong, Viet Lam Phung, Pham Nhut Huy, Hoang Anh Le, Quoc Huu Van, Phong Nguyen-Thuan Do, Van Le Tran Truc, Duc Thanh Chau, Le-Minh Nguyen:

VMLU Benchmarks: A comprehensive benchmark toolkit for Vietnamese LLMs. 11495-11515 - Kai Liu, Jianfei Gao, Kai Chen:

Scaling up the State Size of RNN LLMs for Long-Context Scenarios. 11516-11529 - Bocheng Li, Zhujin Gao, Linli Xu:

Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes. 11530-11551 - Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Lijun Wu, Conghui He:

A Strategic Coordination Framework of Small LMs Matches Large LMs in Data Synthesis. 11552-11570 - Wenrui Xu, Dalin Lyu, Weihang Wang, Jie Feng, Chen Gao, Yong Li:

Defining and Evaluating Visual Language Models' Basic Spatial Abilities: A Perspective from Psychometrics. 11571-11590 - Wenyu Zhang, Wei En Ng, Lixin Ma, Yuwen Wang, Junqi Zhao, Allison Koenecke, Boyang Li, Lu Wang:

SPHERE: Unveiling Spatial Blind Spots in Vision-Language Models Through Hierarchical Evaluation. 11591-11609 - Qijun Miao, Zhixuan Fang:

User-side Model Consistency Monitoring for Open Source Large Language Models Inference Services. 11610-11622 - Weixiong Zheng, Peijian Zeng, Yiwei Li, Hongyan Wu, Nankai Lin, Junhao Chen, Aimin Yang, Yongmei Zhou:

Jailbreaking? One Step Is Enough! 11623-11642 - Yongxin Xu, Ruizhe Zhang, Xinke Jiang, Yujie Feng, Yuzhen Xiao, Xinyu Ma, Runchuan Zhu, Xu Chu, Junfeng Zhao, Yasha Wang:

Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning. 11643-11662 - Yichen He, Guanhua Huang, Peiyuan Feng, Yuan Lin, Yuchen Zhang, Hang Li, Weinan E:

PaSa: An LLM Agent for Comprehensive Academic Paper Search. 11663-11679 - Abhilasha Sancheti, David Dale, Artyom Kozhevnikov, Maha Elbayad:

Less Mature is More Adaptable for Sentence-level Language Modeling. 11680-11695 - Subhajit Chaudhury, Payel Das, Sarathkrishna Swaminathan, Georgios Kollias, Elliot Nelson, Khushbu Pahwa, Tejaswini Pedapati, Igor Melnyk, Matthew Riemer:

EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts. 11696-11708 - Xueyan Zhang, Jinman Zhao, Zhifei Yang, Yibo Zhong, Shuhao Guan, Linbo Cao, Yining Wang:

UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter Efficient Fine-Tuning of Large Models. 11709-11728 - Haotian Wang, Yi Guan, Fanshu Meng, Chao Zhao, Lian Yan, Yang Yang, Jingchi Jiang:

Agri-CM³: A Chinese Massive Multi-modal, Multi-level Benchmark for Agricultural Understanding and Reasoning. 11729-11754 - Junnan Zhu, Min Xiao, Yining Wang, Feifei Zhai, Yu Zhou, Chengqing Zong:

TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification. 11755-11771 - Shane Arora, Marzena Karpinska, Hung-Ting Chen, Ipsita Bhattacharjee, Mohit Iyyer, Eunsol Choi:

CaLMQA: Exploring culturally specific long-form question answering across 23 languages. 11772-11817 - Yushan Zhu, Wen Zhang, Zhiqiang Liu, Mingyang Chen, Lei Liang, Huajun Chen:

Croppable Knowledge Graph Embedding. 11818-11835 - Xinke Jiang, Ruizhe Zhang, Yongxin Xu, Rihong Qiu, Yue Fang, Zhiyuan Wang, Jinyi Tang, Hongxin Ding, Xu Chu, Junfeng Zhao, Yasha Wang:

HyKGE: A Hypothesis Knowledge Graph Enhanced RAG Framework for Accurate and Reliable Medical LLMs Responses. 11836-11856 - Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, WangYan WangYan, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi:

LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models. 11857-11870 - Naibin Gu, Zhenyu Zhang, Xiyu Liu, Peng Fu, Zheng Lin, Shuohuan Wang, Yu Sun, Hua Wu, Weiping Wang, Haifeng Wang:

BeamLoRA: Beam-Constraint Low-Rank Adaptation. 11871-11883 - Yiming Lei, Chenkai Zhang, Zeming Liu, Haitao Leng, Shaoguo Liu, Tingting Gao, Qingjie Liu, Yunhong Wang:

GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art. 11884-11952 - Ang Li, Yiquan Wu, Yifei Liu, Ming Cai, Lizhi Qing, Shihang Wang, Yangyang Kang, Chengyuan Liu, Fei Wu, Kun Kuang:

UniLR: Unleashing the Power of LLMs on Multiple Legal Tasks with a Unified Legal Retriever. 11953-11967 - Haoran Ye, Tianze Zhang, Yuhang Xie, Liyuan Zhang, Yuanyi Ren, Xin Zhang, Guojie Song:

Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models. 11968-11991 - Yeyong Yu, Runsheng Yu, Haojie Wei, Zhanqiu Zhang, Quan Qian:

Beyond Dialogue: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model. 11992-12022 - Huaye Zeng, Dongfu Jiang, Haozhe Wang, Ping Nie, Xiaotong Chen, Wenhu Chen:

ACECODER: Acing Coder RL via Automated Test-Case Synthesis. 12023-12040 - Hang Chen, Xinyu Yang, Jiaying Zhu, Wenya Wang:

Quantifying Semantic Emergence in Language Models. 12041-12054 - Jizheng Chen, Kounianhua Du, Xinyi Dai, Weiming Zhang, Xihuai Wang, Yasheng Wang, Ruiming Tang, Weinan Zhang, Yong Yu:

DebateCoder: Towards Collective Intelligence of LLMs via Test Case Driven LLM Debate for Code Generation. 12055-12065 - Chen Qian, Dongrui Liu, Jie Zhang, Yong Liu, Jing Shao:

The Tug of War Within: Mitigating the Fairness-Privacy Conflicts in Large Language Models. 12066-12095 - Yukun Cao, Shuo Han, Zengyi Gao, Zezhong Ding, Xike Xie, S. Kevin Zhou:

GraphInsight: Unlocking Insights in Large Language Models for Graph Structure Understanding. 12096-12134 - Michael S. Yantosca, Albert M. K. Cheng:

Phonotomizer: A Compact, Unsupervised, Online Training Approach to Real-Time, Multilingual Phonetic Segmentation. 12135-12147 - Bojun Jin, Jianzhu Bao, Yufang Hou, Yang Sun, Yice Zhang, Huajie Wang, Bin Liang, Ruifeng Xu:

A Multi-persona Framework for Argument Quality Assessment. 12148-12170 - Chengwu Liu, Ye Yuan, Yichun Yin, Yan Xu, Xin Xu, Zaoyu Chen, Yasheng Wang, Lifeng Shang, Qun Liu, Ming Zhang:

Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification. 12171-12186 - Yuxuan Hu, Ke Wang, Xiaokang Zhang, Fanjin Zhang, Cuiping Li, Hong Chen, Jing Zhang:

SAM Decoding: Speculative Decoding via Suffix Automaton. 12187-12204 - Yuxin Hu, Danni Liu, Bo Liu, Yida Chen, Jiuxin Cao, Yan Liu:

PsyAdvisor: A Plug-and-Play Strategy Advice Planner with Proactive Questioning in Psychological Conversations. 12205-12229 - Silin Li, Yuhang Guo, Jiashu Yao, Zeming Liu, Haifeng Wang:

HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices. 12230-12250 - Xueyao Zhang, Yuancheng Wang, Chaoren Wang, Ziniu Li, Zhuo Chen, Zhizheng Wu:

Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment. 12251-12270 - Haochen Li, Wanjin Feng, Xin Zhou, Zhiqi Shen:

GiFT: Gibbs Fine-Tuning for Code Generation. 12271-12284 - Yiwen Jiang, Deval Mehta, Wei Feng, Zongyuan Ge:

Enhancing Interpretable Image Classification Through LLM Agents and Conditional Concept Bottleneck Models. 12285-12297 - Xiaowei Zhu, Yubing Ren, Yanan Cao, Xixun Lin, Fang Fang, Yangxi Li:

Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction. 12298-12319 - Junsik Kim, Jinwook Park, Kangil Kim:

RSCF: Relation-Semantics Consistent Filter for Entity Embedding of Knowledge Graph. 12320-12336 - Pinyi Zhang, Siyu An, Lingfeng Qiao, Yifei Yu, Jingyang Chen, Jie Wang, Di Yin, Xing Sun, Kai Zhang:

RolePlot: A Systematic Framework for Evaluating and Enhancing the Plot-Progression Capabilities of Role-Playing Agents. 12337-12354 - Zhenyu Hou, Ziniu Hu, Yujiang Li, Rui Lu, Jie Tang, Yuxiao Dong:

TreeRL: LLM Reinforcement Learning with On-Policy Tree Search. 12355-12369 - Emre Can Acikgoz, Jeremiah Greer, Akul Datta, Ze Yang, William Zeng, Oussama Elachqar, Emmanouil Koukoumidis, Dilek Hakkani-Tür, Gokhan Tur:

Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model. 12370-12390 - Yupu Liang, Yaping Zhang, Zhiyang Zhang, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou:

Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation. 12391-12408 - Aobo Kong, Wentao Ma, Shiwan Zhao, Yongbin Li, Yuchuan Wu, Ke Wang, Xiaoqian Liu, Qicheng Li, Yong Qin, Fei Huang:

SDPO: Segment-Level Direct Preference Optimization for Social Agents. 12409-12423 - Zhiyang Qi, Takumasa Kaneko, Keiko Takamizo, Mariko Ukiyo, Michimasa Inaba:

KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors. 12424-12443 - Xiangchao Yan, Shiyang Feng, Jiakang Yuan, Renqiu Xia, Bin Wang, Lei Bai, Bo Zhang:

SURVEYFORGE: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing. 12444-12465 - Yexing Du, Youcheng Pan, Ziyang Ma, Bo Yang, Yifan Yang, Keqi Deng, Xie Chen, Yang Xiang, Ming Liu, Bing Qin:

Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning. 12466-12478 - Yilun Zhao, Weiyuan Chen, Zhijian Xu, Manasi Patwardhan, Chengye Wang, Yixin Liu, Lovekesh Vig, Arman Cohan:

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research. 12479-12491 - Zicheng Zhang, Xiangyu Zhao, Xinyu Fang, Chunyi Li, Xiaohong Liu, Xiongkuo Min, Haodong Duan, Kai Chen, Guangtao Zhai:

Redundancy Principles for MLLMs Benchmarks. 12492-12504 - Yifu Chen, Shengpeng Ji, Haoxiao Wang, Ziqing Wang, Siyu Chen, Jinzheng He, Jin Xu, Zhou Zhao:

WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models. 12505-12523 - Jiaming Zhou, Shiyao Wang, Shiwan Zhao, Jiabei He, Haoqin Sun, Hui Wang, Cheng Liu, Aobo Kong, Yujie Guo, Xi Yang, Yequan Wang, Yonghua Lin, Yong Qin:

ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5. 12524-12537 - Yao Xiao, Hai Ye, Linyao Chen, Hwee Tou Ng, Lidong Bing, Xiaoli Li, Roy Ka-Wei Lee:

Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization. 12538-12552 - Yuhao Wang, Keyan Ding, Kehua Feng, Zeyuan Wang, Ming Qin, Xiaotong Li, Qiang Zhang, Huajun Chen:

Enhancing Safe and Controllable Protein Generation via Knowledge Preference Optimization. 12553-12569 - Mingqing Zhang, Qiang Liu, Xiang Tao, Shu Wu, Liang Wang:

SINCon: Mitigate LLM-Generated Malicious Message Injection Attack for Rumor Detection. 12570-12581 - Jungwoo Park, Taewhoo Lee, Chanwoong Yoon, Hyeon Hwang, Jaewoo Kang:

Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models. 12582-12600 - Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen:

Agentic Knowledgeable Self-awareness. 12601-12625 - Jifang Wang, Yangxue Yangxue, Longyue Wang, Zhenran Xu, Yiyu Wang, Yaowei Wang, Weihua Luo, Kaifu Zhang, Baotian Hu, Min Zhang:

A Unified Agentic Framework for Evaluating Conditional Image Generation. 12626-12646 - Chao Lei, Yanchuan Chang, Nir Lipovetzky, Krista A. Ehinger:

Planning-Driven Programming: A Large Language Model Programming Workflow. 12647-12684 - Yuan Sui, Yufei He, Zifeng Ding, Bryan Hooi:

Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering. 12685-12701 

