


NAACL-HLT Findings 2025: Albuquerque, New Mexico, USA
- Luis Chiruzzo, Alan Ritter, Lu Wang: Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29 - May 4, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-195-7
- Ranran Haoran Zhang, Bensu Uçar, Soumik Dey, Hansi Wu, Binbin Li, Rui Zhang: From Lazy to Prolific: Tackling Missing Labels in Open Vocabulary Extreme Classification by Positive-Unlabeled Sequence Learning. 1-16
- Pucheng Dang, Xing Hu, Dong Li, Rui Zhang, Qi Guo, Kaidi Xu: DiffZOO: A Purely Query-Based Black-Box Attack for Red-teaming Text-to-Image Generative Model via Zeroth Order Optimization. 17-31
- Yongqi Fan, Hongli Sun, Kui Xue, Xiaofan Zhang, Shaoting Zhang, Tong Ruan: MedOdyssey: A Medical Domain Benchmark for Long Context Evaluation Up to 200K Tokens. 32-56
- Almog Gueta, Amir Feder, Zorik Gekhman, Ariel Goldstein, Roi Reichart: Can LLMs Learn Macroeconomic Narratives from Social Media? 57-78
- Leonidas Gee, Milan Gritta, Gerasimos Lampouras, Ignacio Iacobacci: Code-Optimise: Self-Generated Preference Data for Correctness and Efficiency. 79-94
- Junhyuk Choi, Yeseon Hong, Bugeun Kim: People will agree what I think: Investigating LLM's False Consensus Effect. 95-126
- Joel Niklaus, Lucia Zheng, Arya D. McCarthy, Christopher Hahn, Brian M. Rosen, Peter Henderson, Daniel E. Ho, Garrett Honke, Percy Liang, Christopher D. Manning: LawInstruct: A Resource for Studying Language Model Adaptation to the Legal Domain. 127-152
- Hao Yang, Hongyuan Lu, Xinhua Zeng, Yang Liu, Xiang Zhang, Haoran Yang, Yumeng Zhang, Shan Huang, Yiran Wei, Wai Lam: Stephanie: Step-by-Step Dialogues for Mimicking Human Interactions in Social Conversations. 153-166
- Clare Arrington, Maurício Gruppi, Sibel Adali: ConShift: Sense-based Language Variation Analysis using Flexible Alignment. 167-181
- Jieming Cao, Chen Huang, Yanan Zhang, Ruibo Deng, Jincheng Zhang, Wenqiang Lei: Breaking the Stigma! Unobtrusively Probe Symptoms in Depression Disorder Diagnosis Dialogue. 182-200
- Tinh Son Luong, Thanh-Thien Le, Thang Viet Doan, Linh Ngo Van, Thien Huu Nguyen, Nguyen Thi Ngoc Diep: ToVo: Toxicity Taxonomy via Voting. 201-212
- Tianyi Li, Erenay Dayanik, Shubhi Tyagi, Andrea Pierleoni: HALLUCANA: Fixing LLM Hallucination with A Canary Lookahead. 213-230
- Xin Liu, Aoyang Zhou, Kun He: Enhancing Adversarial Transferability in Visual-Language Pre-training Models via Local Shuffle and Sample-based Attack. 231-245
- Ieva Staliunaite, Andreas Vlachos: Dis2Dis: Explaining Ambiguity in Fact-Checking. 246-267
- Xiyao Wang, Jiuhai Chen, Zhaoyang Wang, Yuhang Zhou, Yiyang Zhou, Huaxiu Yao, Tianyi Zhou, Tom Goldstein, Parminder Bhatia, Taha A. Kass-Hout, Furong Huang, Cao Xiao: Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement. 268-282
- Peiran Wang, Xiaogeng Liu, Chaowei Xiao: RePD: Defending Jailbreak Attack through a Retrieval-based Prompt Decomposition Process. 283-294
- Chuang Li, Yang Deng, Hengchang Hu, Min-Yen Kan, Haizhou Li: ChatCRS: Incorporating External Knowledge and Goal Guidance for LLM-based Conversational Recommender Systems. 295-312
- Zehan Wang, Haifeng Huang, Yang Zhao, Ziang Zhang, Tao Jin, Zhou Zhao: Data-Efficiently Learn Large Language Model for Universal 3D Scene Perception. 313-333
- Zhaowei Li, Wei Wang, Yiqing Cai, Qi Xu, Pengyu Wang, Dong Zhang, Hang Song, Botian Jiang, Zhida Huang, Tao Wang: UnifiedMLLM: Enabling Unified Representation for Multi-modal Multi-tasks With Large Language Model. 334-344
- Chen Lin, Fei Li, Donghong Ji, Chong Teng: PEMV: Improving Spatial Distribution for Emotion Recognition in Conversations Using Proximal Emotion Mean Vectors. 345-357
- Xuming Hu, Xiao Qin, Chuan Lei, Asterios Katsifodimos, Zhengyuan Shen, Balasubramaniam Srinivasan, Huzefa Rangwala: DiscoverGPT: Multi-task Fine-tuning Large Language Model for Related Table Discovery. 358-373
- Takehiro Takayanagi, Hiroya Takamura, Kiyoshi Izumi, Chung-Chi Chen: Can GPT-4 Sway Experts' Investment Decisions? 374-383
- Xuming Hu, Chuan Lei, Xiao Qin, Asterios Katsifodimos, Christos Faloutsos, Huzefa Rangwala: PolyJoin: Semantic Multi-key Joinable Table Search in Data Lakes. 384-395
- Dapeng Jiang, Xiao Luo: Marrying LLMs with Dynamic Forecasting: A Graph Mixture-of-expert Perspective. 396-410
- Minbin Huang, Yanxin Long, Xinchi Deng, Ruihang Chu, Jiangfeng Xiong, Xiaodan Liang, Hong Cheng, Qinglin Lu, Wei Liu: DialogGen: Multi-modal Interactive Dialogue System with Multi-turn Text-Image Generation. 411-426
- T. Y. S. S. Santosh, Chen Jia, Patrick Goroncy, Matthias Grabmair: RELexED: Retrieval-Enhanced Legal Summarization with Exemplar Diversity. 427-434
- Shangda Wu, Yashan Wang, Ruibin Yuan, Zhancheng Guo, Xu Tan, Ge Zhang, Monan Zhou, Jing Chen, Xuefeng Mu, Yuejie Gao, Yuanliang Dong, Jiafeng Liu, Xiaobing Li, Feng Yu, Maosong Sun: CLaMP 2: Multimodal Music Information Retrieval Across 101 Languages Using Large Language Models. 435-451
- Xin Huang, Ting Zhang, Wen Zhao: LogRules: Enhancing Log Analysis Capability of Large Language Models through Rules. 452-470
- Yingqiang Gao, Lukas Fischer, Alexa Lintner, Sarah Ebling: Audio Description Generation in the Era of LLMs and VLMs: A Review of Transferable Generative AI Technologies. 471-490
- Xi Wang, Procheta Sen, Ruizhe Li, Emine Yilmaz: Adaptive Retrieval-Augmented Generation for Conversational Systems. 491-503
- Junxiang Qiu, Jinda Lu, Shuo Wang: Multimodal Generation with Consistency Transferring. 504-513
- Stephen Meisenbacher, Maulik Chevli, Florian Matthes: On the Impact of Noise in Differentially Private Text Rewriting. 514-532
- Zhen Qian, Xiuzhen Zhang, Xiaofei Xu, Feng Xia: Teaching Large Language Models Number-Focused Headline Generation With Key Element Rationales. 533-550
- Fabian Retkowski, Alexander Waibel: Zero-Shot Strategies for Length-Controllable Summarization. 551-572
- Wonjoong Kim, Sangwu Park, Yeonjun In, Seokwon Han, Chanyoung Park: SIMPLOT: Enhancing Chart Question Answering by Distilling Essentials. 573-593
- Shufan Li, Harkanwar Singh, Aditya Grover: InstructAny2Pix: Image Editing with Multi-Modal Prompts. 594-619
- Yiyang Luo, Ke Lin, Chao Gu, Jiahui Hou, Lijie Wen, Luo Ping: Lost in Overlap: Exploring Logit-based Watermark Collision in LLMs. 620-637
- Yejin Jeon, Youngjae Kim, Jihyun Lee, Gary Lee: Prompt-Guided Selective Masking Loss for Context-Aware Emotive Text-to-Speech. 638-650
- Ruizhe Chen, Yichen Li, Jianfei Yang, Yang Feng, Joey Tianyi Zhou, Jian Wu, Zuozhu Liu: Identifying and Mitigating Social Bias Knowledge in Language Models. 651-672
- Sathya Krishnan Suresh, Mengjun Wu, Tushar Pranav, Engsiong Chng: DiaSynth: Synthetic Dialogue Generation Framework for Low Resource Dialogue Applications. 673-690
- Duygu Nur Yaldiz, Yavuz Faruk Bakman, Baturalp Buyukates, Chenyang Tao, Anil Ramakrishna, Dimitrios Dimitriadis, Jieyu Zhao, Salman Avestimehr: Do Not Design, Learn: A Trainable Scoring Function for Uncertainty Estimation in Generative LLMs. 691-713
- Jianpeng Hu, Chao Xue, Chunqing Yu, Jiacheng Xu, Chengxiang Tan: Joint Learning Event-Specific Probe and Argument Library with Differential Optimization for Document-Level Multi-Event Extraction. 714-726
- Yichen Yang, Xin Liu, Kun He: Synonym-unaware Fast Adversarial Training against Textual Adversarial Attacks. 727-739
- Mayank Nagda, Phil Ostheimer, Sophie Fellenz: Tethering Broken Themes: Aligning Neural Topic Models with Labels and Authors. 740-760
- Matthieu Futeral, Cordelia Schmid, Benoît Sagot, Rachel Bawden: Towards Zero-Shot Multimodal Machine Translation. 761-778
- Yang Liu, Lan Lan, Jiahuan Cao, Hiuyi Cheng, Kai Ding, Lianwen Jin: Large-Scale Corpus Construction and Retrieval-Augmented Generation for Ancient Chinese Poetry: New Method and Data Insights. 779-817
- Alessio Cocchieri, Giacomo Frisoni, Marcos Martínez Galindo, Gianluca Moro, Giuseppe Tagliavini, Francesco Candoli: OpenBioNER: Lightweight Open-Domain Biomedical Named Entity Recognition Through Entity Type Description. 818-837
- Ryan Soh-Eun Shim, Barbara Plank: Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum. 838-849
- Marcell Fekete, Johannes Bjerva: Linguistically Grounded Analysis of Language Models using Shapley Head Values. 850-865
- Xiao Luo, Binqi Chen, Haixin Wang, Zhiping Xiao, Ming Zhang, Yizhou Sun: How Do Large Language Models Perform in Dynamical System Modeling. 866-880
- Kaichen Zhang, Bo Li, Peiyuan Zhang, Fanyi Pu, Joshua Adrian Cahyono, Kairui Hu, Shuai Liu, Yuanhan Zhang, Jingkang Yang, Chunyuan Li, Ziwei Liu: LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models. 881-916
- Xiaotong Zhang, Qianru Zhou, Han Liu, Hong Yu: Pairwise Prompt-Based Tuning with Parameter Efficient Fast Adaptation for Generalized Zero-Shot Intent Detection. 917-929
- Zhuang Luo, Yichuan Li, Zexing Xu, Kyumin Lee, S. Rasoul Etesami: FaithfulPersona: Balancing Faithfulness and Personalization in Code Explanations through Self-Critique. 930-944
- Wei Zhou, Mohsen Mesgar, Annemarie Friedrich, Heike Adel: Efficient Multi-Agent Collaboration with Tool Use for Online Planning in Complex Table Question Answering. 945-968
- Sirui Xia, Xintao Wang, Jiaqing Liang, Yifei Zhang, Weikang Zhou, Jiaji Deng, Fei Yu, Yanghua Xiao: Ground Every Sentence: Improving Retrieval-Augmented LLMs with Interleaved Reference-Claim Generation. 969-988
- Lindsey Vanderlyn, Dirk Väth, Ngoc Thang Vu: Understanding the Role of Mental Models in User Interaction with an Adaptive Dialog Agent. 989-1015
- T. Y. S. S. Santosh, Youssef Farag, Matthias Grabmair: CoPERLex: Content Planning with Event-based Representations for Legal Case Summarization. 1016-1032
- Quancai Liu, Haihui Fan, Jinchao Zhang, Xiangfang Li, Chuanrong Li, Bo Li: DisComp: A Two-Stage Prompt Optimization Framework Combining Task-Agnostic and Task-Aware Compression. 1033-1044
- Sang Quang Nguyen, Kiet Van Nguyen: A Large-Scale Benchmark for Vietnamese Sentence Paraphrases. 1045-1060
- Yang Bai, Christan Grant, Daisy Zhe Wang: RAMQA: A Unified Framework for Retrieval-Augmented Multi-Modal Question Answering. 1061-1076
- Adarsh Pyarelal, John Culnan, Ayesha Qamar, Meghavarshini Krishnaswamy, Yuwei Wang, Cheonkam Jeong, Chen Chen, Md Messal Monem Miah, Shahriar Hormozi, Jonathan Tong, Ruihong Huang: MultiCAT: Multimodal Communication Annotations for Teams. 1077-1111
- Dinghao Pan, Yuanyuan Sun, Bo Xu, Jiru Li, Zhihao Yang, Ling Luo, Hongfei Lin, Jian Wang: Prototype Tuning: A Meta-Learning Approach for Few-Shot Document-Level Relation Extraction with Large Language Models. 1112-1128
- Shubham Kumar Nigam, Tanmay Dubey, Govind Sharma, Noel Shallum, Kripabandhu Ghosh, Arnab Bhattacharya: LegalSeg: Unlocking the Structure of Indian Legal Judgments Through Rhetorical Role Classification. 1129-1144
- Minkyoo Song, Hanna Kim, Jaehan Kim, Youngjin Jin, Seungwon Shin: Claim-Guided Textual Backdoor Attack for Practical Applications. 1145-1159
- Jiarui Lu, Thomas Holleis, Yizhe Zhang, Bernhard Aumayer, Feng Nan, Haoping Bai, Shuang Ma, Shen Ma, Mengyu Li, Guoli Yin, Zirui Wang, Ruoming Pang: ToolSandbox: A Stateful, Conversational, Interactive Evaluation Benchmark for LLM Tool Use Capabilities. 1160-1183
- Qilong Wu, Xiaoneng Xiang, Hejia Huang, Xuan Wang, Wei Jie Yeo, Ranjan Satapathy, Ricardo Shirota Filho, Bharadwaj Veeravalli: SusGen-GPT: A Data-Centric LLM for Financial NLP and Sustainability Report Generation. 1184-1203
- Daniil Gurgurov, Rishu Kumar, Simon Ostermann: GrEmLIn: A Repository of Green Baseline Embeddings for 87 Low-Resource Languages Injected with Multilingual Graph Knowledge. 1204-1221
- Armel Randy Zebaze, Benoît Sagot, Rachel Bawden: In-Context Example Selection via Similarity Search Improves Low-Resource Machine Translation. 1222-1252
- Ne Luo, Aryo Pradipta Gema, Xuanli He, Emile van Krieken, Pietro Lesci, Pasquale Minervini: Self-Training Large Language Models for Tool-Use Without Demonstrations. 1253-1271
- Lekang Jiang, Caiqi Zhang, Pascal A Scherz, Stefan Goetz: Can Large Language Models Generate High-quality Patent Claims? 1272-1287
- Jaehan Kim, Minkyoo Song, Seung Ho Na, Seungwon Shin: Obliviate: Neutralizing Task-agnostic Backdoors within the Parameter-efficient Fine-tuning Paradigm. 1288-1307
- Yiruo Cheng, Kelong Mao, Ziliang Zhao, Guanting Dong, Hongjin Qian, Yongkang Wu, Tetsuya Sakai, Ji-Rong Wen, Zhicheng Dou: CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmented Generation. 1308-1330
- Itai Mondshine, Tzuf Paz-Argaman, Reut Tsarfaty: Beyond English: The Impact of Prompt Translation Strategies across Languages and Tasks in Multilingual LLMs. 1331-1354
- Varun Nagaraj Rao, Eesha Agarwal, Samantha Dalal, Dana Calacci, Andrés Monroy-Hernández: QuaLLM: An LLM-based Framework to Extract Quantitative Insights from Online Forums. 1355-1369
- Tomás Horych, Christoph Mandl, Terry Ruas, André Greiner-Petter, Bela Gipp, Akiko Aizawa, Timo Spinde: The Promises and Pitfalls of LLM Annotations in Dataset Labeling: a Case Study on Media Bias Detection. 1370-1386
- Lin Zhang, Lijie Hu, Di Wang: Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning. 1387-1404
- Yuyi Huang, Runzhe Zhan, Derek F. Wong, Lidia S. Chao, Ailin Tao: Intrinsic Model Weaknesses: How Priming Attacks Unveil Vulnerabilities in Large Language Models. 1405-1425
- Soichiro Murakami, Peinan Zhang, Hidetaka Kamigaito, Hiroya Takamura, Manabu Okumura: AdParaphrase: Paraphrase Dataset for Analyzing Linguistic Features toward Generating Attractive Ad Texts. 1426-1439
- Falko Helm, Nico Daheim, Iryna Gurevych: Token Weighting for Long-Range Language Modeling. 1440-1459
- Takyoung Kim, Kyungjae Lee, Young Rok Jang, Ji Yong Cho, Gangwoo Kim, Minseok Cho, Moontae Lee: Learning to Explore and Select for Coverage-Conditioned Retrieval-Augmented Generation. 1460-1480
- Zhiwen Ruan, Yixia Li, He Zhu, Longyue Wang, Weihua Luo, Kaifu Zhang, Yun Chen, Guanhua Chen: LayAlign: Enhancing Multilingual Reasoning in Large Language Models via Layer-Wise Adaptive Fusion and Alignment Strategy. 1481-1495
- Nam Le Hai, Dung Manh Nguyen, Nghi D. Q. Bui: On the Impacts of Contexts on Repository-Level Code Generation. 1496-1524
- Moritz Plenz, Philipp Heinisch, Janosch Gehring, Philipp Cimiano, Anette Frank: From Argumentation to Deliberation: Perspectivized Stance Vectors for Fine-grained (Dis)agreement Analysis. 1525-1553
- Souvik Kundu, Anahita Bhiwandiwalla, Sungduk Yu, Phillip Howard, Tiep Le, Sharath Nittur Sridhar, David Cobbley, Hao Kang, Vasudev Lal: LVLM-Compress-Bench: Benchmarking the Broader Impact of Large Vision-Language Model Compression. 1554-1570
- David Ifeoluwa Adelani, A. Seza Dogruöz, Iyanuoluwa Shode, Anuoluwapo Aremu: Does Generative AI speak Nigerian-Pidgin?: Issues about Representativeness and Bias for Multilingualism in LLMs. 1571-1583
- Gaurav Sahu, Olga Vechtomova, Issam H. Laradji: A Guide To Effectively Leveraging LLMs for Low-Resource Text Summarization: Data Augmentation and Semi-supervised Approaches. 1584-1603
- Aashiq Muhamed, Mona T. Diab, Virginia Smith: Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models. 1604-1635
- Oana Ignat, Xiaomeng Xu, Rada Mihalcea: MAiDE-up: Multilingual Deception Detection of AI-generated Hotel Reviews. 1636-1653
- T. Y. S. S. Santosh, Isaac Misael Olguín Nolasco, Matthias Grabmair: LeCoPCR: Legal Concept-guided Prior Case Retrieval for European Court of Human Rights cases. 1654-1661
- Simeng Sun, Cheng-Ping Hsieh: How much do contextualized representations encode long-range context? 1662-1679
- Pengfei He, Han Xu, Yue Xing, Hui Liu, Makoto Yamada, Jiliang Tang: Data Poisoning for In-context Learning. 1680-1700
- Adil Soubki, John Murzaku, Peter Zeng, Owen Rambow: Synthetic Audio Helps for Cognitive State Tasks. 1701-1708
- Prasanth Bathala, Christophe Ye, Batuhan Nursal, Shubham Lohiya, David Kartchner, Cassie S. Mitchell: BioEL: A Comprehensive Python Package for Biomedical Entity Linking. 1709-1721
- Rupak Sarkar, Patrick Y. Wu, Kristina Miler, Alexander Miserlis Hoyle, Philip Resnik: PairScale: Analyzing Attitude Change with Pairwise Comparisons. 1722-1738
- Chenyu Wang, Weichao Zhou, Shantanu Ghosh, Kayhan Batmanghelich, Wenchao Li: Semantic Consistency-Based Uncertainty Quantification for Factuality in Radiology Report Generation. 1739-1754
- Nathan Lambert, Valentina Pyatkin, Jacob Morrison, Lester James V. Miranda, Bill Yuchen Lin, Khyathi Raghavi Chandu, Nouha Dziri, Sachin Kumar, Tom Zick, Yejin Choi, Noah A. Smith, Hannaneh Hajishirzi: RewardBench: Evaluating Reward Models for Language Modeling. 1755-1797
- Sree Bhattacharyya, James Z. Wang: Evaluating Vision-Language Models for Emotion Recognition. 1798-1820
- Crystina Zhang, Jing Lu, Vinh Q. Tran, Tal Schuster, Donald Metzler, Jimmy Lin: Tomato, Tomahto, Tomate: Do Multilingual Language Models Understand Based on Subword-Level Semantic Concepts? 1821-1837
- Siyi Liu, Qiang Ning, Kishaloy Halder, Zheng Qi, Wei Xiao, Phu Mon Htut, Yi Zhang, Neha Anna John, Bonan Min, Yassine Benajiba, Dan Roth: Open Domain Question Answering with Conflicting Contexts. 1838-1854
- Artem Kirsanov, Chi-Ning Chou, Kyunghyun Cho, SueYeon Chung: The Geometry of Prompting: Unveiling Distinct Mechanisms of Task Adaptation in Language Models. 1855-1888
- Pedro Cisneros-Velarde: Biases in Opinion Dynamics in Multi-Agent Systems of Large Language Models: A Case Study on Funding Allocation. 1889-1916
- Mourad Heddaya, Kyle MacMillan, Hongyuan Mei, Chenhao Tan, Anup Malani: CaseSumm: A Large-Scale Dataset for Long-Context Summarization from U.S. Supreme Court Opinions. 1917-1942
- Harshita Diddee, Daphne Ippolito: Chasing Random: Instruction Selection Strategies Fail to Generalize. 1943-1957
- Manveer Singh Tamber, Jasper Xian, Jimmy Lin: Can't Hide Behind the API: Stealing Black-Box Commercial Embedding Models. 1958-1969
- Sara Ghaboura, Ahmed Heakl, Omkar Thawakar, Ali Husain Salem Abdulla Alharthi, Ines Riahi, Abduljalil Radman, Jorma Laaksonen, Fahad Shahbaz Khan, Salman Khan, Rao Muhammad Anwer: CAMEL-Bench: A Comprehensive Arabic LMM Benchmark. 1970-1980
- David Anugraha, Genta Indra Winata, Chenyue Li, Patrick Amadeus Irawan, En-Shiun Annie Lee: ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models. 1981-2011
- Giang Do, Hung Le, Truyen Tran: SimSMoE: Toward Efficient Training Mixture of Experts via Solving Representational Collapse. 2012-2025
- Sahel Sharifymoghaddam, Shivani Upadhyay, Wenhu Chen, Jimmy Lin: UniRAG: Universal Retrieval Augmentation for Large Vision Language Models. 2026-2039
- Behrad Moniri, Hamed Hassani, Edgar Dobriban: Evaluating the Performance of Large Language Models via Debates. 2040-2075
- Ivaxi Sheth, Bahare Fatemi, Mario Fritz: CausalGraph2LLM: Evaluating LLMs for Causal Queries. 2076-2098
- Hammad A. Ayyubi, Xuande Feng, Junzhang Liu, Xudong Lin, Zhecan Wang, Shih-Fu Chang: PuzzleGPT: Emulating Human Puzzle-Solving Ability for Time and Location Prediction. 2099-2116
- Ruidi Chang, Chunyuan Deng, Hanjie Chen: SAFR: Neuron Redistribution for Interpretability. 2117-2126
- Yuyang Jiang, Chacha Chen, Dang Nguyen, Benjamin M. Mervak, Chenhao Tan: GPT-4V Cannot Generate Radiology Reports Yet. 2127-2154
- Renyi Qu, Ruixuan Tu, Forrest Sheng Bao: Is Semantic Chunking Worth the Computational Cost? 2155-2177
- Abdulla Alshabanah, Murali Annavaram: On Using Arabic Language Dialects in Recommendation Systems. 2178-2186
- Hadi Askari, Anshuman Chhabra, Muhao Chen, Prasant Mohapatra: Assessing LLMs for Zero-shot Abstractive Summarization Through the Lens of Relevance Paraphrasing. 2187-2201
- Zehui Wu, Ziwei Gong, Lin Ai, Pengyuan Shi, Kaan Donbekci, Julia Hirschberg: Beyond Silent Letters: Amplifying LLMs in Emotion Recognition with Vocal Nuances. 2202-2218
- Haohan Yuan, Haopeng Zhang: DomainSum: A Hierarchical Benchmark for Fine-Grained Domain Shift in Abstractive Text Summarization. 2219-2231
- Wenjie Jacky Mo, Jiashu Xu, Qin Liu, Jiongxiao Wang, Jun Yan, Hadi Askari, Chaowei Xiao, Muhao Chen: Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations. 2232-2249
- Michael Hardy: "All that Glitters": Techniques for Evaluations with Unreliable Model and Human Annotations. 2250-2278
- Xiaoming Shi, Zeming Liu, Yiming Lei, Chenkai Zhang, Haitao Leng, Chuan Wang, Qingjie Liu, Wanxiang Che, Yunhong Wang: KwaiChat: A Large-Scale Video-Driven Multilingual Mixed-Type Dialogue Corpus. 2279-2294
- Raghuveer Thirukovalluru, Bhuwan Dhingra: GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings. 2295-2308
- Kuo-Han Hung, Ching-Yun Ko, Ambrish Rawat, I-Hsin Chung, Winston H. Hsu, Pin-Yu Chen: Attention Tracker: Detecting Prompt Injection Attacks in LLMs. 2309-2322
- Tianshu Yu, Zihan Gong, Minghuan Tan, Guhong Chen, Min Yang: Unsupervised Speech-text word-level alignment with Dynamic Programming. 2323-2334
- Hengxing Cai, Xiaochen Cai, Junhan Chang, Sihang Li, Lin Yao, Changxin Wang, Zhifeng Gao, Hongshuai Wang, Yongge Li, Mujie Lin, Shuwen Yang, Jiankun Wang, Mingjun Xu, Jin Huang, Xi Fang, Jiaxi Zhuang, Yuqi Yin, Yaqi Li, Changhong Chen, Zheng Cheng, Zifeng Zhao, Linfeng Zhang, Guolin Ke: SciAssess: Benchmarking LLM Proficiency in Scientific Literature Analysis. 2335-2357
- Samuele Poppi, Zheng-Xin Yong, Yifei He, Bobbie Chern, Han Zhao, Aobo Yang, Jianfeng Chi: Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks. 2358-2372
- Xingjian Zhang, Yutong Xie, Jin Huang, Jinge Ma, Zhaoying Pan, Qijia Liu, Ziyang Xiong, Tolga Ergen, Dongsub Shim, Honglak Lee, Qiaozhu Mei: MASSW: A New Dataset and Benchmark Tasks for AI-Assisted Scientific Workflows. 2373-2394
- Tanawan Premsri, Parisa Kordjamshidi: Neuro-symbolic Training for Reasoning over Spatial Language. 2395-2414
- Anubrata Das, Manoj Kumar, Ninareh Mehrabi, Anil Ramakrishna, Anna Rumshisky, Kai-Wei Chang, Aram Galstyan, Morteza Ziyadi, Rahul Gupta: On Localizing and Deleting Toxic Memories in Large Language Models. 2415-2423
- Yifan Liu, Yu Fang, Zhouhan Lin: DiVISe: Direct Visual-Input Speech Synthesis Preserving Speaker Characteristics And Intelligibility. 2424-2439
- Yuanfu Sun, Zhengnan Ma, Yi Fang, Jing Ma, Qiaoyu Tan: GraphICL: Unlocking Graph Learning Potential in LLMs through Structured Prompt Design. 2440-2459
- Divyansh Singh, Brodie Mather, Demi Zhang, Patrick Lehman, Justin Ho, Bonnie J. Dorr: FIDELITY: Fine-grained Interpretable Distillation for Effective Language Insights and Topic Yielding. 2460-2472
- Jiali Chen, Xusen Hei, Yuqi Xue, Zihan Wu, Jiayuan Xie, Yi Cai: Classic4Children: Adapting Chinese Literary Classics for Children with Large Language Model. 2473-2488
- Juseon-Do, Jaesung Hwang, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura: Considering Length Diversity in Retrieval-Augmented Summarization. 2489-2500
- Zhenyue Qin, Yu Yin, Dylan Campbell, Xuansheng Wu, Ke Zou, Ninghao Liu, Yih Chung Tham, Xiuzhen Zhang, Qingyu Chen: LMOD: A Large Multimodal Ophthalmology Dataset and Benchmark for Large Vision-Language Models. 2501-2522
- Minsang Kim, Seung Jun Baek: Syntriever: How to Train Your Retriever with Synthetic Data from LLMs. 2523-2539
- Qi Zhang, Huitong Pan, Zhijia Chen, Longin Jan Latecki, Cornelia Caragea, Eduard C. Dragut: DynClean: Training Dynamics-based Label Cleaning for Distantly-Supervised Named Entity Recognition. 2540-2556
- Andrew Bai, Chih-Kuan Yeh, Cho-Jui Hsieh, Ankur Taly: An Efficient Rehearsal Scheme for Catastrophic Forgetting Mitigation during Multi-stage Fine-tuning. 2557-2569
- Weiqing Yang, Hanbin Wang, Zhenghao Liu, Xinze Li, Yukun Yan, Shuo Wang, Yu Gu, Minghe Yu, Zhiyuan Liu, Ge Yu: COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis. 2570-2585
- Zezhong Wang, Xingshan Zeng, Weiwen Liu, Yufei Wang, Liangyou Li, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu, Kam-Fai Wong: Chain-of-Probe: Examining the Necessity and Accuracy of CoT Step-by-Step. 2586-2606
- Abhishek Kumar Singh, Vishwajeet Kumar, Rudra Murthy, Jaydeep Sen, Ashish R. Mittal, Ganesh Ramakrishnan: INDIC QA BENCHMARK: A Multilingual Benchmark to Evaluate Question Answering capability of LLMs for Indic Languages. 2607-2626
- Juanhui Li, Sreyashi Nag, Hui Liu, Xianfeng Tang, Sheikh Muhammad Sarwar, Limeng Cui, Hansu Gu, Suhang Wang, Qi He, Jiliang Tang: Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data. 2627-2641
- Zhaoguang Long, Yuhao Zhou, Shangqing Zhao, Yupei Ren, Li Cai, Chenghao Jia, Zhe Chen, Zhe Fang, Yuxiang Song, Man Lan: LSDC: An Efficient and Effective Large-Scale Data Compression Method for Supervised Fine-tuning of Large Language Models. 2642-2653
- Yueqi Song, Simran Khanuja, Graham Neubig: What Is Missing in Multilingual Visual Reasoning and How to Fix It. 2654-2667
- Hui Sun, Rongxin Chen: Enhancing the Prototype Network with Local-to-Global Optimization for Few-Shot Relation Extraction. 2668-2677
- Xuhan Huang, Qingning Shen, Yan Hu, Anningzhe Gao, Benyou Wang: LLMs for Mathematical Modeling: Towards Bridging the Gap between Natural and Mathematical Languages. 2678-2710
- Sara Bourbour Hosseinbeigi, Behnam Rohani, Mostafa Masoudi, Mehrnoush Shamsfard, Zahra Saaberi, Mostafa Karimi Manesh, Mohammad Amin Abbasi: Advancing Persian LLM Evaluation. 2711-2727
- Zile Qiao, Wei Ye, Yong Jiang, Tong Mo, Pengjun Xie, Weiping Li, Fei Huang, Shikun Zhang: Supportiveness-based Knowledge Rewriting for Retrieval-augmented Language Modeling. 2728-2740
- Jiatao Li, Xinyu Hu, Xunjian Yin, Xiaojun Wan: Evaluating Self-Generated Documents for Enhancing Retrieval-Augmented Generation with Large Language Models. 2741-2775
- Wei Han, Hui Chen, Soujanya Poria: PREMISE: Matching-based Prediction for Accurate Review Recommendation. 2776-2794
- Junyu Luo, Xiao Luo, Xiusi Chen, Zhiping Xiao, Wei Ju, Ming Zhang: Semi-supervised Fine-tuning for Large Language Models. 2795-2808
- Yumeng Wang, Zhiyuan Fan, Qingyun Wang, Yi R. Fung, Heng Ji: CALM: Unleashing the Cross-Lingual Self-Aligning Ability of Language Model Question Answering. 2809-2817
- Heejin Do, Taehee Park, Sangwon Ryu, Gary Lee: Towards Prompt Generalization: Grammar-aware Cross-Prompt Automated Essay Scoring. 2818-2824
- Yongqi Fan, Nan Wang, Kui Xue, Jingping Liu, Tong Ruan: MedEureka: A Medical Domain Benchmark for Multi-Granularity and Multi-Data-Type Embedding-Based Retrieval. 2825-2851
- Jujia Zhao, Wenjie Wang, Chen Xu, See-Kiong Ng, Tat-Seng Chua: A Federated Framework for LLM-based Recommendation. 2852-2865
- Leyi Pan, Aiwei Liu, Yijian Lu, Zitian Gao, Yichen Di, Lijie Wen, Irwin King, Philip S. Yu: WaterSeeker: Pioneering Efficient Detection of Watermarked Segments in Large Documents. 2866-2882
- Chanhee Park, Hyeonseok Moon, Chanjun Park, Heuiseok Lim: MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation. 2883-2900
- Zhuohan Xie, Rui Xing, Yuxia Wang, Jiahui Geng, Hasan Iqbal, Dhruv Sahnan, Iryna Gurevych, Preslav Nakov: FIRE: Fact-checking with Iterative Retrieval and Verification. 2901-2914
- Eduardo Calò, Lydia Penkert, Saad Mahamood: Lessons from a User Experience Evaluation of NLP Interfaces. 2915-2929
- Zeyu Zhang, Jianxun Lian, Chen Ma, Yaning Qu, Ye Luo, Lei Wang, Rui Li, Xu Chen, Yankai Lin, Le Wu, Xing Xie, Ji-Rong Wen: TrendSim: Simulating Trending Topics in Social Media Under Poisoning Attacks with LLM-based Multi-agent System. 2930-2949
- Abdelrahman Abdallah, Jamshid Mozafari, Bhawna Piryani, Adam Jatowt: ASRank: Zero-Shot Re-Ranking with Answer Scent for Document Retrieval. 2950-2970
- Shaoming Duan, Youxuan Wu, Chuanyi Liu, Yuhao Zhang, Zirui Wang, Peiyi Han, Shengyuan Yu, Liang Yan, Yingwei Liang: DSQG-Syn: Synthesizing High-quality Data for Text-to-SQL Parsing by Domain Specific Question Generation. 2971-2989
- Junhyeok Kim, Min Soo Kim, Jiwan Chung, Jungbin Cho, Jisoo Kim, Sungwoong Kim, Gyeongbo Sim, Youngjae Yu: EgoSpeak: Learning When to Speak for Egocentric Conversational Agents in the Wild. 2990-3005
- Chengyue Wu, Zhixuan Liang, Yixiao Ge, Qiushan Guo, Zeyu Lu, Jiahao Wang, Ying Shan, Ping Luo: Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots. 3006-3028
- Xinping Zhao, Yan Zhong, Zetian Sun, Xinshuo Hu, Zhenyu Liu, Dongfang Li, Baotian Hu, Min Zhang: FunnelRAG: A Coarse-to-Fine Progressive Retrieval Paradigm for RAG. 3029-3046
- Ikhyun Cho, Changyeon Park, Julia Hockenmaier: The Power of Bullet Lists: A Simple Yet Effective Prompting Approach to Enhancing Spatial Reasoning in Large Language Models. 3047-3057
- Hai Huang, Sashuai Zhou, Yan Xia: Overcoming both Domain Shift and Label Shift for Referring Video Segmentation. 3058-3069
- Belinda Z. Li, Emmy Liu, Alexis Ross, Abbas Zeitoun, Graham Neubig, Jacob Andreas: Language Modeling with Editable External Knowledge. 3070-3090
- Yuyan Bu, Liangyu Huo, Yi Jing, Qing Yang: Beyond Excess and Deficiency: Adaptive Length Bias Mitigation in Reward Models for RLHF. 3091-3098
- Vishnu Kabir Chhabra, Ding Zhu, Mohammad Mahdi Khalili: Neuroplasticity and Corruption in Model Mechanisms: A Case Study Of Indirect Object Identification. 3099-3122
- Hanan Gani, Rohit Bharadwaj, Muzammal Naseer, Fahad Shahbaz Khan, Salman Khan: VANE-Bench: Video Anomaly Evaluation Benchmark for Conversational LMMs. 3123-3140
- Jiachen Ma, Yijiang Li, Zhiqing Xiao, Anda Cao, Jie Zhang, Chao Ye, Junbo Zhao: Jailbreaking Prompt Attack: A Controllable Adversarial Attack against Diffusion Models. 3141-3157
- Mahshid Dehghani, Amirahmad Shafiee, Ali Shafiei, Neda Fallah, Farahmand Alizadeh, Mohammad Mehdi Gholinejad, Hamid Behroozi, Jafar Habibi, Ehsaneddin Asgari: Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description. 3158-3172
- Min Zeng, Haiqin Yang, Xi Chen, Yike Guo: Task-wrapped Continual Learning in Task-Oriented Dialogue Systems. 3173-3183
- Katerina Korre, Arianna Muti, Federico Ruggeri, Alberto Barrón-Cedeño: Untangling Hate Speech Definitions: A Semantic Componential Analysis Across Cultures and Domains. 3184-3198
- Zora Zhiruo Wang, Akari Asai, Xinyan Velocity Yu, Frank F. Xu, Yiqing Xie, Graham Neubig, Daniel Fried: CodeRAG-Bench: Can Retrieval Augment Code Generation? 3199-3214
- Wenjin Tian, Xianying Huang, Shihao Zou: Multi-Condition Guided Diffusion Network for Multimodal Emotion Recognition in Conversation. 3215-3227
- Samuel Cahyawijaya, Ruochen Zhang, Jan Christian Blaise Cruz, Holy Lovenia, Elisa Gilbert, Hiroki Nomoto, Alham Fikri Aji: Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Senses. 3228-3250
- Yuhao Du, Zhuo Li, Pengyu Cheng, Xiang Wan, Anningzhe Gao: Atoxia: Red-teaming Large Language Models with Target Toxic Answers. 3251-3266
- Matan Avitan, Ryan Cotterell, Yoav Goldberg, Shauli Ravfogel: A Practical Method for Generating String Counterfactuals. 3267-3286
- Ingeol Baek, Hwan Chang, Byeongjeong Kim, Jimin Lee, Hwanhee Lee: Probing-RAG: Self-Probing to Guide Language Models in Selective Document Retrieval. 3287-3304
- Jie Gong, Qiwang Hu: Extracting Military Event Temporal Relations via Relative Event Time Prediction and Virtual Adversarial Training. 3305-3317
- Wenjun Li, Changyu Chen, Pradeep Varakantham: Unlocking the Planning Capabilities of Large Language Models with Maximum Diversity Fine-tuning. 3318-3340
- Yixing Li, Ruobing Xie, Xingwu Sun, Yu Cheng, Zhanhui Kang: Continuous Speech Tokenizer in Text To Speech. 3341-3347
- Owen Cook, Charlie Grimshaw, Ben Peng Wu, Sophie Dillon, Jack Hicks, Luke Jones, Thomas Smith, Matyas Szert, Xingyi Song: Efficient Annotator Reliability Assessment and Sample Weighting for Knowledge-Based Misinformation Detection on Social Media. 3348-3358
- Wenting Zhao, Alexander M. Rush, Tanya Goyal: Challenges in Trustworthy Human Evaluation of Chatbots. 3359-3365
- Zhengyuan Zhu, Zeyu Zhang, Haiqi Zhang, Chengkai Li: RATSD: Retrieval Augmented Truthfulness Stance Detection from Social Media Posts Toward Factual Claims. 3366-3381
- Jinlin Wang, Suyuchen Wang, Ziwen Xia, Sirui Hong, Yun Zhu, Bang Liu, Chenglin Wu: FACT: Examining the Effectiveness of Iterative Context Rewriting for Multi-fact Retrieval. 3382-3392
- Xingjian Diao, Chunhui Zhang, Weiyi Wu, Zhongyu Ouyang, Peijun Qing, Ming Cheng, Soroush Vosoughi, Jiang Gui: Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding. 3393-3409
- Kyle Wong, Alfonso Amayuelas, Liangming Pan, William Yang Wang: Investigating the Transferability of Code Repair for Low-Resource Programming Languages. 3410-3432
- Jiayang Song, Yuheng Huang, Zhehua Zhou, Lei Ma: Multilingual Blending: Large Language Model Safety Alignment Evaluation with Language Mixture. 3433-3449
- Jiarui Wu, Zhuo Liu, Hangfeng He: Mitigating Hallucinations in Multimodal Spatial Relations through Constraint-Aware Prompting. 3450-3468
- Junjie Liu, Shaotian Yan, Chen Shen, Zhengdong Xiao, Liang Xie, Wenxiao Wang, Jieping Ye: Concise and Organized Perception Facilitates Reasoning in Large Language Models. 3469-3498
- Zhaoyang Wang, Jinqi Jiang, Huichi Zhou, Wenhao Zheng, Xuchao Zhang, Chetan Bansal, Huaxiu Yao: Verifiable Format Control for Large Language Model Generations. 3499-3513
- Hwiyeol Jo, Taiwoo Park, Hyunwoo Lee, Nayoung Choi, Changbong Kim, Ohjoon Kwon, Donghyeon Jeon, Eui-Hyeon Lee, Kyoungho Shin, Sun Suk Lim, Kyungmi Kim, Jihye Lee, Sun Kim: Taxonomy and Analysis of Sensitive User Queries in Generative AI Search System. 3514-3529
- Pengzhou Cheng, Wei Du, Zongru Wu, Fengwei Zhang, Libo Chen, Zhuosheng Zhang, Gongshen Liu: SynGhost: Invisible and Universal Task-agnostic Backdoor Attack via Syntactic Transfer. 3530-3546
- Wenhan Wang, Chenyuan Yang, Zhijie Wang, Yuheng Huang, Zhaoyang Chu, Da Song, Lingming Zhang, An Ran Chen, Lei Ma:
TESTEVAL: Benchmarking Large Language Models for Test Case Generation. 3547-3562 - Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, Shimin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang:
Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Models. 3563-3605 - Dahyun Jung, Seungyoon Lee, Hyeonseok Moon, Chanjun Park, Heuiseok Lim:
FLEX: A Benchmark for Evaluating Robustness of Fairness in Large Language Models. 3606-3620 - Shufan Chen, He Zheng, Lei Cui:
When and How to Augment Your Input: Question Routing Helps Balance the Accuracy and Efficiency of Large Language Models. 3621-3634 - Ziwen Li, Xiang Chen, Youngseung Jeon:
GraPPI: A Retrieve-Divide-Solve GraphRAG Framework for Large-scale Protein-protein Interaction Exploration. 3635-3648 - Geonyeong Son, Jaeyoung Lee, Misuk Kim:
From Curiosity to Clarity: Exploring the Impact of Consecutive Why-Questions. 3649-3664 - Saranya Venkatraman, Nafis Irtiza Tripto, Dongwon Lee:
CollabStory: Multi-LLM Collaborative Story Generation and Authorship Analysis. 3665-3679 - Pranshu Pandya, Vatsal Gupta, Agney S. Talwarr, Tushar Kataria, Dan Roth, Vivek Gupta:
NTSEBENCH: Cognitive Reasoning Benchmark for Vision Language Models. 3680-3708 - Yuqi Zhu, Shuofei Qiao, Yixin Ou, Shumin Deng, Shiwei Lyu, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen, Ningyu Zhang:
KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents. 3709-3732 - Jahyun Koo, Yerin Hwang, Yongil Kim, Taegwan Kang, Hyunkyung Bae, Kyomin Jung:
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models. 3733-3746 - Abhijit Mishra, Shreya Shukla, Jose Torres, Jacek Gwizdka, Shounak Roychowdhury:
Thought2Text: Text Generation from EEG Signal using Large Language Models (LLMs). 3747-3759 - Zhiqiang Shi, Ruchit Agrawal:
A Comprehensive Survey of Contemporary Arabic Sentiment Analysis: Methods, Challenges, and Future Directions. 3760-3772 - Shintaro Ozaki, Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe:
Towards Cross-Lingual Explanation of Artwork in Large-scale Vision Language Models. 3773-3809 - Yiyi Chen, Qiongxiu Li, Russa Biswas, Johannes Bjerva:
Large Language Models are Easily Confused: A Quantitative Metric, Security Implications and Typological Analysis. 3810-3827 - Xidong Wang, Jianquan Li, Shunian Chen, Yuxuan Zhu, Xiangbo Wu, Zhiyi Zhang, Xiaolong Xu, Junying Chen, Jie Fu, Xiang Wan, Anningzhe Gao, Benyou Wang:
Huatuo-26M, a Large-scale Chinese Medical QA Dataset. 3828-3848 - Han Liu, Shuqin Li, Xiaotong Zhang, Yuanyuan Wang, Feng Zhang, Hongyang Chen, Hong Yu:
SEP-MLDC: A Simple and Effective Paradigm for Multi-Label Document Classification. 3849-3859 - Qi Zhao, Qi Song, Tian Xie, Haiyue Zhang, Hongyu Yang, Xiangyang Li:
Improving Pre-trained Language Models with Knowledge Enhancement and Filtering Framework. 3860-3871 - Jiazhou Chen, Xu Jia, Ruiqiang Guo:
Using Review Combination and Pseudo-Tokens for Aspect Sentiment Quad Prediction. 3872-3883 - Chentao Huang, Guangli Li, Xinjiong Zhou, Yafeng Ren, Hongbin Zhang:
DDGIP: Radiology Report Generation Through Disease Description Graph and Informed Prompting. 3884-3894 - Sukmin Cho, Sangjin Choi, Taeho Hwang, Jeongyeon Seo, Soyeong Jeong, Huije Lee, Hoyun Song, Jong C. Park, Youngjin Kwon:
Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding. 3895-3911 - Jialiang Wu, Yi Shen, Sijia Liu, Yi Tang, Sen Song, Xiaoyi Wang, Longjun Cai:
Improve Decoding Factuality by Token-wise Cross Layer Entropy of Large Language Models. 3912-3921 - Zhaopeng Feng, Yan Zhang, Hao Li, Bei Wu, Jiayu Liao, Wenqiang Liu, Jun Lang, Yang Feng, Jian Wu, Zuozhu Liu:
TEaR: Improving LLM-based Machine Translation with Systematic Self-Refinement. 3922-3938 - Yiwei Wang, Muhao Chen, Nanyun Peng, Kai-Wei Chang:
Vulnerability of Large Language Models to Output Prefix Jailbreaks: Impact of Positions on Safety. 3939-3952 - Yuan Tian, Minzheng Wang, Nan Xu, Wenji Mao:
ImaRA: An Imaginative Frame Augmented Method for Low-Resource Multimodal Metaphor Detection and Explanation. 3953-3967 - Peiqin Lin, André F. T. Martins, Hinrich Schütze:
XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples. 3968-3977 - Haoyi Qiu, Alexander R. Fabbri, Divyansh Agarwal, Kung-Hsiang Huang, Sarah Tan, Nanyun Peng, Chien-Sheng Wu:
Evaluating Cultural and Social Awareness of LLM Web Agents. 3978-4005 - Runchuan Zhu, Xinke Jiang, Jiang Wu, Zhipeng Ma, Jiahe Song, Fengshuo Bai, Dahua Lin, Lijun Wu, Conghui He:
GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation. 4006-4021 - Fu Zhang, Hongsen Yu, Jingwei Cheng, Huangming Xu:
Entity Pair-guided Relation Summarization and Retrieval in LLMs for Document-level Relation Extraction. 4022-4037 - Peiqin Lin, André F. T. Martins, Hinrich Schütze:
A Recipe of Parallel Corpora Exploitation for Multilingual Large Language Models. 4038-4050 - Shulei Wang, Shuai Yang, Wang Lin, Zirun Guo, Sihang Cai, Hai Huang, Ye Wang, Jingyuan Chen, Tao Jin:
Omni-Chart-600K: A Comprehensive Dataset of Chart Types for Chart Understanding. 4051-4069 - Yassine El Kheir, Younes Samih, Suraj Maharjan, Tim Polzehl, Sebastian Möller:
Comprehensive Layer-wise Analysis of SSL Models for Audio Deepfake Detection. 4070-4082 - Iuliia Zaitova, Vitalii Hirak, Badr M. Abdullah, Dietrich Klakow, Bernd Möbius, Tania Avgustinova:
Attention on Multiword Expressions: A Multilingual Study of BERT-based Models with Regard to Idiomaticity and Microsyntax. 4083-4092 - Jiwei Tang, Jin Xu, Tingwei Lu, Zhicheng Zhang, Yiming Zhao, Lin Hai, Hai-Tao Zheng:
Perception Compressor: A Training-Free Prompt Compression Framework in Long Context Scenarios. 4093-4108 - Nishat Raihan, Joanna C. S. Santos, Marcos Zampieri:
MojoBench: Language Modeling and Benchmarks for Mojo. 4109-4128 - Kang-il Lee, Minbeom Kim, Seunghyun Yoon, Minsung Kim, Dongryeol Lee, Hyukhun Koh, Kyomin Jung:
VLind-Bench: Measuring Language Priors in Large Vision-Language Models. 4129-4144 - Yuntong Hu, Zhihan Lei, Zheng Zhang, Bo Pan, Chen Ling, Liang Zhao:
GRAG: Graph Retrieval-Augmented Generation. 4145-4157 - Zhili Feng, Dhananjay Ram, Cole Hawkins, Aditya Rawal, Jinman Zhao, Sheng Zha:
Sequence-level Large Language Model Training with Contrastive Preference Optimization. 4158-4164 - Haritz Puerto, Martin Gubri, Sangdoo Yun, Seong Joon Oh:
Scaling Up Membership Inference: When and How Attacks Succeed on Large Language Models. 4165-4182 - Kyungmin Min, Minbeom Kim, Kang-il Lee, Dongryeol Lee, Kyomin Jung:
Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding. 4183-4198 - Xiaoyi Bao, Minjie Qiang, Jinghang Gu, Zhongqing Wang, Chu-Ren Huang:
Exploring Hybrid Sampling Inference for Aspect-based Sentiment Analysis. 4199-4210 - Jeonghyun Ko, Gyeongyun Park, Donghoon Lee, Kyunam Lee:
FeRG-LLM: Feature Engineering by Reason Generation Large Language Models. 4211-4228 - Abdellah El Mekki, Muhammad Abdul-Mageed:
Effective Self-Mining of In-Context Examples for Unsupervised Machine Translation with LLMs. 4229-4256 - Shuhe Wang, Xiaofei Sun, Xiaoya Li, Rongbin Ouyang, Fei Wu, Tianwei Zhang, Jiwei Li, Guoyin Wang, Chen Guo:
GPT-NER: Named Entity Recognition via Large Language Models. 4257-4275 - Changhai Zhou, Yuhua Zhou, Yibin Wang, Shijie Han, Qian Qiao, Hongguang Li:
QPruner: Probabilistic Decision Quantization for Structured Pruning in Large Language Models. 4276-4286 - Pingyu Wu, Daiheng Gao, Jin Tang, Huimin Chen, Wenbo Zhou, Weiming Zhang, Nenghai Yu:
MES-RAG: Bringing Multi-modal, Entity-Storage, and Secure Enhancements to RAG. 4287-4298 - Yizheng Sun, Yanze Xin, Hao Li, Jingyuan Sun, Chenghua Lin, Riza Batista-Navarro:
LVPruning: An Effective yet Simple Language-Guided Vision Token Pruning Approach for Multi-modal Large Language Models. 4299-4308 - Sergey Pletenev, Maria Marina, Daniil Moskovskiy, Vasily Konovalov, Pavel Braslavski, Alexander Panchenko, Mikhail Salnikov:
How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM? 4309-4322 - Xinyuan Lu, Liangming Pan, Yubo Ma, Preslav Nakov, Min-Yen Kan:
TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning. 4323-4339 - Zhihui Shao, Shubin Cai, Rongsheng Lin, Zhong Ming:
Enhancing Text-to-SQL with Question Classification and Multi-Agent Collaboration. 4340-4349 - Wataru Hashimoto, Hidetaka Kamigaito, Taro Watanabe:
Efficient Nearest Neighbor based Uncertainty Estimation for Natural Language Processing Tasks. 4350-4366 - Hanyong Lee, Chaelyn Lee, Yongjae Lee, Jaesung Lee:
BitAbuse: A Dataset of Visually Perturbed Texts for Defending Phishing Attacks. 4367-4384 - Weiqi Wu, Shen Huang, Yong Jiang, Pengjun Xie, Fei Huang, Hai Zhao:
Unfolding the Headline: Iterative Self-Questioning for News Retrieval and Timeline Summarization. 4385-4398 - Chuwen Chen, Shuai Zhang:
RetrieverGuard: Empowering Information Retrieval to Combat LLM-Generated Misinformation. 4399-4411 - Seungwoo Song, Junghun Yuk, ChangSu Choi, Hangyeol Yoo, HyeonSeok Lim, KyungTae Lim, Jungyeul Park:
Unified Automated Essay Scoring and Grammatical Error Correction. 4412-4426 - Ka Man Lo, Zeyu Huang, Zihan Qiu, Zili Wang, Jie Fu:
A Closer Look into Mixture-of-Experts in Large Language Models. 4427-4447 - Tulio Ferreira Leite Da Silva, Gonzalo Freijedo Aduna, Farah Benamara, Alda Mari, Zongmin Li, Li Yue, Jian Su:
CDB: A Unified Framework for Hope Speech Detection Through Counterfactual, Desire and Belief. 4448-4463 - Jiyue Jiang, Pengan Chen, Liheng Chen, Sheng Wang, Qinghang Bao, Lingpeng Kong, Yu Li, Chuan Wu:
How Well Do LLMs Handle Cantonese? Benchmarking Cantonese Capabilities of Large Language Models. 4464-4505 - Zihuiwen Ye, Fraser Greenlee-Scott, Max Bartolo, Phil Blunsom, Jon Ander Campos, Matthias Gallé:
Improving Reward Models with Synthetic Critiques. 4506-4520 - Yuanyi Wang, Han Li, Haifeng Sun, Lei Zhang, Bo He, Wei Tang, Tianhao Yan, Qi Qi, Jingyu Wang:
Rethinking Smoothness for Fast and Adaptable Entity Alignment Decoding. 4521-4535 - Meiyun Wang, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo:
Lost in the Distance: Large Language Models Struggle to Capture Long-Distance Relational Knowledge. 4536-4544 - Jabez Magomere, Elena Kochkina, Samuel Mensah, Simerjot Kaur, Charese Smiley:
FinNLI: Novel Dataset for Multi-Genre Financial Natural Language Inference Benchmarking. 4545-4568 - Atharva Mehta, Shivam Chauhan, Amirbek Djanibekov, Atharva Kulkarni, Gus Xia, Monojit Choudhury:
Music for All: Representational Bias and Cross-Cultural Adaptability of Music Generation Models. 4569-4585 - Zhijie Bao, Qingyun Liu, Xuanjing Huang, Zhongyu Wei:
SFMSS: Service Flow aware Medical Scenario Simulation for Conversational Data Generation. 4586-4604 - Mingqi Gao, Yixin Liu, Xinyu Hu, Xiaojun Wan, Jonathan Bragg, Arman Cohan:
Re-evaluating Automatic LLM System Ranking for Alignment with Human Preference. 4605-4629 - Priya Mishra, Suraj Racha, Kaustubh Ponkshe, Adit Akarsh, Ganesh Ramakrishnan:
GuideQ: Framework for Guided Questioning for progressive informational collection and classification. 4630-4644 - Kirti Bhagat, Kinshuk Vasisht, Danish Pruthi:
Richer Output for Richer Countries: Uncovering Geographical Disparities in Generated Stories and Travel Recommendations. 4645-4653 - Gagan Bhatia, El Moatez Billah Nagoudi, Abdellah El Mekki, Fakhraddin Alwajih, Muhammad Abdul-Mageed:
Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and Cross-Cultural Embedding Models and Benchmarks. 4654-4670 - Jipeng Zhang, Yaxuan Qin, Renjie Pi, Weizhong Zhang, Rui Pan, Tong Zhang:
TAGCOS: Task-agnostic Gradient Clustered Coreset Selection for Instruction Tuning Data. 4671-4686 - Navya Jain, Zekun Wu, Cristian E. Muñoz Villalobos, Airlie Hilliard, Xin Guan, Adriano S. Koshiyama, Emre Kazim, Philip C. Treleaven:
From Text to Emoji: How PEFT-Driven Personality Manipulation Unleashes the Emoji Potential in LLMs. 4687-4723 - Jane Warren, Gary M. Weiss, Fernando Martinez, Annika Guo, Yijun Zhao:
Decoding Fatphobia: Examining Anti-Fat and Pro-Thin Bias in AI-Generated Images. 4724-4736 - Guoli Yin, Haoping Bai, Shuang Ma, Feng Nan, Yanchao Sun, Zhaoyang Xu, Shen Ma, Jiarui Lu, Xiang Kong, Aonan Zhang, Dian Ang Yap, Yizhe Zhang, Karsten Ahnert, Vik Kamath, Mathias Berglund, Dominic Walsh, Tobias Gindele, Juergen Wiest, Zhengfeng Lai, Xiaoming Wang, Jiulong Shan, Meng Cao, Ruoming Pang, Zirui Wang:
MMAU: A Holistic Benchmark of Agent Capabilities Across Diverse Domains. 4737-4765 - Ashutosh Sathe, Divyanshu Aggarwal, Sunayana Sitaram:
Improving Consistency in LLM Inference using Probabilistic Tokenization. 4766-4778 - Tianrong Zhang, Bochuan Cao, Yuanpu Cao, Lu Lin, Prasenjit Mitra, Jinghui Chen:
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response. 4779-4807 - Swanand Vaishampayan, Hunter Leary, Yoseph Berhanu Alebachew, Louis Hickman, Brent A Stevenor, Weston Beck, Chris Brown:
Human and LLM-Based Resume Matching: An Observational Study. 4808-4823 - Brian Tufts, Xuandong Zhao, Lei Li:
A Practical Examination of AI-Generated Text Detectors for Large Language Models. 4824-4841 - Ingroj Shrestha, Louis Tay, Padmini Srinivasan:
Robust Bias Detection in MLMs and its Application to Human Trait Ratings. 4842-4858 - Michael Galarnyk, Agam Shah, Dipanwita Guhathakurta, Poojitha Nandigam, Sudheer Chava:
How Inclusively do LMs Perceive Social and Moral Norms? 4859-4869 - Yu-Ling Hsu, Hsuan Su, Shang-Tse Chen:
Jailbreaking with Universal Multi-Prompts. 4870-4891 - Xiaoying Song, Sharon Lisseth Perez, Xinchen Yu, Eduardo Blanco, Lingzi Hong:
Echoes of Discord: Forecasting Hater Reactions to Counterspeech. 4892-4905 - Athiya Deviyani, Fernando Diaz:
Contextual Metric Meta-Evaluation by Measuring Local Metric Accuracy. 4906-4925 - Thennal D. K, Jesin James, Deepa P. Gopinath, Muhammed Ashraf K:
Advocating Character Error Rate for Multilingual ASR Evaluation. 4926-4935 - Irwin Deng, Kushagra Dixit, Dan Roth, Vivek Gupta:
Enhancing Temporal Understanding in LLMs for Semi-structured Tables. 4936-4955 - Mohammad Jahid Ibna Basher, Md. Kowsher, Md Saiful Islam, Rabindra Nath Nandi, Nusrat Jahan Prottasha, Mehadi Hasan Menon, Tareq Al Muntasir, Shammur Absar Chowdhury, Firoj Alam, Niloofar Yousefi, Ozlem O. Garibay:
BnTTS: Few-Shot Speaker Adaptation in Low-Resource Setting. 4956-4968 - Lian Remme, Kevin Tang:
Playing with Voices: Tabletop Role-Playing Game Recordings as a Diarization Challenge. 4969-4983 - Yuen Chen, Vethavikashini Chithrra Raghuram, Justus Mattern, Rada Mihalcea, Zhijing Jin:
Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias. 4984-5004 - Yuling Gu, Oyvind Tafjord, Bailey Kuehl, Dany Haddad, Jesse Dodge, Hannaneh Hajishirzi:
OLMES: A Standard for Language Model Evaluations. 5005-5033 - Joy Crosbie, Ekaterina Shutova:
Induction Heads as an Essential Mechanism for Pattern Matching in In-context Learning. 5034-5096 - Chongyang Gao, Kezhen Chen, Jinmeng Rao, Ruibo Liu, Baochen Sun, Yawen Zhang, Daiyi Peng, Xiaoyuan Guo, V. S. Subrahmanian:
MoLA: MoE LoRA with Layer-wise Expert Allocation. 5097-5112 - Md. Ashraful Islam, Mohammed Eunus Ali, Md. Rizwan Parvez:
CodeSim: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging. 5113-5139 - Cathy Jiao, Weizhen Gao, Aditi Raghunathan, Chenyan Xiong:
On the Feasibility of In-Context Probing for Data Attribution. 5140-5155 - Gonçalo Gomes, Chrysoula Zerva, Bruno Martins:
Evaluation of Multilingual Image Captioning: How far can we get with CLIP models? 5156-5175 - Guangyao Dou, Zheyuan Liu, Qing Lyu, Kaize Ding, Eric Wong:
Avoiding Copyright Infringement via Large Language Model Unlearning. 5176-5200 - Xuanyu Su, Yansong Li, Diana Inkpen, Nathalie Japkowicz:
A Context-Aware Contrastive Learning Framework for Hateful Meme Detection and Segmentation. 5201-5215 - Jie S. Li, Jonas Geiping, Micah Goldblum, Aniruddha Saha, Tom Goldstein:
LLM-Generated Passphrases That Are Secure and Easy to Remember. 5216-5234 - Yujuan Fu, Özlem Uzuner, Meliha Yetisgen, Fei Xia:
Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions. 5235-5256 - Deokgi Kim, Joonyoung Jo, Byung-Won On, Ingyu Lee:
Representation-to-Creativity (R2C): Automated Holistic Scoring Model for Essay Creativity. 5257-5275 - Catarina G. Belém, Pouya Pezeshkpour, Hayate Iso, Seiji Maekawa, Nikita Bhutani, Estevam Hruschka:
From Single to Multi: How LLMs Hallucinate in Multi-Document Summarization. 5276-5309 - Fei Wang, Chao Shang, Shuai Wang, Sarthak Jain, Qiang Ning, Bonan Min, Vittorio Castelli, Yassine Benajiba, Dan Roth:
Aligning to Constraints for Data-Efficient Language Model Customization. 5310-5325 - Armineh Nourbakhsh, Siddharth Parekh, Pranav Shetty, Zhao Jin, Sameena Shah, Carolyn P. Rosé:
Where is this coming from? Making groundedness count in the evaluation of Document VQA models. 5326-5346 - Xinbo Wu, Lav R. Varshney:
Transformer-based Causal Language Models Perform Clustering. 5347-5372 - Zaifu Zhan, Rui Zhang:
Towards Better Multi-task Learning: A Framework for Optimizing Dataset Combinations in Large Language Models. 5373-5386 - Chun-Yi Kuan, Hung-yi Lee:
Gender Bias in Instruction-Guided Speech Synthesis Models. 5387-5413 - Zeao Tu, Xiangdi Meng, Yu He, Zihan Yao, Tianyu Qi, Jun Liu, Ming Li:
ResoFilter: Fine-grained Synthetic Data Filtering for Large Language Models through Data-Parameter Resonance Analysis. 5414-5428 - Yuzhe Yang, Yifei Zhang, Yan Hu, Yilin Guo, Ruoli Gan, Yueru He, Mingcong Lei, Xiao Zhang, Haining Wang, Qianqian Xie, Jimin Huang, Honghai Yu, Benyou Wang:
UCFE: A User-Centric Financial Expertise Benchmark for Large Language Models. 5429-5448 - Yuankai Li, Jia-Chen Gu, Di Wu, Kai-Wei Chang, Nanyun Peng:
BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression. 5449-5470 - Weipeng Jiang, Zhenting Wang, Juan Zhai, Shiqing Ma, Zhengyu Zhao, Chao Shen:
An Optimizable Suffix Is Worth A Thousand Templates: Efficient Black-box Jailbreaking without Affirmative Phrases via LLM as Optimizer. 5471-5483 - Changhao Guan, Chao Huang, Hongliang Li, You Li, Ning Cheng, Zihe Liu, Yufeng Chen, Jinan Xu, Jian Liu:
Multi-Stage LLM Fine-Tuning with a Continual Learning Setting. 5484-5498 - Hao-Xiang Xu, Jun-Yu Ma, Zhen-Hua Ling, Ningyu Zhang, Jia-Chen Gu:
Constraining Sequential Model Editing with Editing Anchor Compression. 5499-5515 - Zayd Muhammad Kawakibi Zuhri, Muhammad Farid Adilazuarda, Ayu Purwarianti, Alham Fikri Aji:
MLKV: Multi-Layer Key-Value Heads for Memory Efficient Transformer Decoding. 5516-5525 - Michael J. Q. Zhang, Eunsol Choi:
Clarify When Necessary: Resolving Ambiguity Through Interaction with LMs. 5526-5543 - Mariam Nakhlé, Marco Dinarelli, Raheel Qader, Emmanuelle Esperança-Rodier, Hervé Blanchon:
DOLFIN - Document-Level Financial Test-Set for Machine Translation. 5544-5556 - Nafis Neehal, Bowen Wang, Shayom Debopadhaya, Corey Curran, Keerthiram Murugesan, Soham Dan, Vibha Anand, Kristin P. Bennett:
Are Large Language Models Effective in Clinical Trial Design? A Study on Baseline Feature Generation. 5557-5570 - Qianren Mao, Weifeng Jiang, Junnan Liu, Chenghua Lin, Qian Li, Xianqing Wen, Jianxin Li, Jinhu Lu:
Lightweight Contenders: Navigating Semi-Supervised Text Mining through Peer Collaboration and Self Transcendence. 5571-5585 - Young Min Cho, Dandan Pang, Stuti Thapa, Garrick Sherman, Lyle H. Ungar, Louis Tay, Sharath Chandra Guntuku:
Language-based Valence and Arousal Expressions between the United States and China: a Cross-Cultural Examination. 5586-5600 - Juntae Lee, Jihwan Bang, Kyuhong Shim, Seunghan Yang, Simyung Chang:
Chain-of-Rank: Enhancing Large Language Models for Domain-Specific RAG in Edge Device. 5601-5608 - Xujia Wang, Haiyan Zhao, Shuo Wang, Hanqing Wang, Zhiyuan Liu:
MALoRA: Mixture of Asymmetric Low-Rank Adaptation for Enhanced Multi-Task Learning. 5609-5626 - Mohamed Bayan Kmainasi, Ali Ezzat Shahroor, Maram Hasanain, Sahinur Rahman Laskar, Naeemul Hassan, Firoj Alam:
LlamaLens: Specialized Multilingual LLM for Analyzing News and Social Media Content. 5627-5649 - Iain Weissburg, Sathvika Anand, Sharon Levy, Haewon Jeong:
LLMs are Biased Teachers: Evaluating LLM Bias in Personalized Education. 5650-5698 - Si-An Chen, Hsuan-Tien Lin, Chih-Jen Lin:
Preserving Zero-shot Capability in Supervised Fine-tuning for Multi-label Text Classification. 5699-5712 - Zhenting Wang, Zhizhi Wang, Mingyu Jin, Mengnan Du, Juan Zhai, Shiqing Ma:
Data-centric NLP Backdoor Defense from the Lens of Memorization. 5713-5731 - Sen Yang, Xin Li, Leyang Cui, Lidong Bing, Wai Lam:
Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs. 5732-5744 - Revanth Gangi Reddy, Sagnik Mukherjee, Jeonghwan Kim, Zhenhailong Wang, Dilek Hakkani-Tür, Heng Ji:
Infogent: An Agent-Based Framework for Web Information Aggregation. 5745-5758 - Nilmadhab Das, Vijaya V. Saradhi, Ashish Anand:
On the Role of Key Phrases in Argument Mining. 5759-5772 - Somraj Gautam, Abhishek Bhandari, Gaurav Harit:
TabComp: A Dataset for Visual Table Reading Comprehension. 5773-5780 - Changhai Zhou, Shijie Han, Lining Yang, Yuhua Zhou, Xu Cheng, Yibin Wang, Hongguang Li:
RankAdaptor: Hierarchical Rank Allocation for Efficient Fine-Tuning Pruned LLMs via Performance Model. 5781-5795 - SeongYeub Chu, Jong Woo Kim, Bryan Wong, Mun Yong Yi:
Rationale Behind Essay Scores: Enhancing S-LLM's Multi-Trait Essay Scoring with Rationale Generated by LLMs. 5796-5814 - Wanqi Yang, Yanda Li, Meng Fang, Ling Chen:
MTPChat: A Multimodal Time-Aware Persona Dataset for Conversational Agents. 5815-5826 - Mozhi Zhang, Pengyu Wang, Chenkun Tan, Mianqiu Huang, Dong Zhang, Yaqian Zhou, Xipeng Qiu:
MetaAlign: Align Large Language Models with Diverse Preferences during Inference Time. 5827-5845 - Yongjin Yang, Haneul Yoo, Hwaran Lee:
MAQA: Evaluating Uncertainty Quantification in LLMs Regarding Data Uncertainty. 5846-5863 - Hyundong Justin Cho, Karishma Sharma, Nicolaas Paul Jedema, Leonardo F. R. Ribeiro, Jonathan May, Alessandro Moschitti:
Tuning-Free Personalized Alignment via Trial-Error-Explain In-Context Learning. 5864-5885 - Jing Ma:
Causal Inference with Large Language Model: A Survey. 5886-5898 - Chanwoong Yoon, Gangwoo Kim, Byeongguk Jeon, Sungdong Kim, Yohan Jo, Jaewoo Kang:
Ask Optimal Questions: Aligning Large Language Models with Retriever's Preference in Conversation. 5899-5921 - Kushagra Bhushan, Yatin Nandwani, Dinesh Khandelwal, Sonam Gupta, Gaurav Pandey, Dinesh Raghu, Sachindra Joshi:
Systematic Knowledge Injection into Large Language Models via Diverse Augmentation for Domain-Specific RAG. 5922-5943 - Hyeonseok Moon, Jaehyung Seo, Seungyoon Lee, Chanjun Park, Heuiseok Lim:
Find the Intention of Instruction: Comprehensive Evaluation of Instruction Understanding for Large Language Models. 5944-5964 - Yuto Nishida, Makoto Morishita, Hiroyuki Deguchi, Hidetaka Kamigaito, Taro Watanabe:
Long-Tail Crisis in Nearest Neighbor Language Models. 5965-5978 - Gal Yona, Or Honovich, Omer Levy, Roee Aharoni:
Keep Guessing? When Considering Inference Scaling, Mind the Baselines. 5979-5991 - Ruiyao Xu, Kaize Ding:
Large Language Models for Anomaly and Out-of-Distribution Detection: A Survey. 5992-6012 - Qianyi Hu, Xinhui Tu, Guo Cong, Shunping Zhang:
Time-aware ReAct Agent for Temporal Knowledge Graph Question Answering. 6013-6024 - Xiaochen Wang, Junqing He, Liang Chen, Gholamreza Haffari, Yiru Wang, Zhe Yang, Xiangdi Meng, Kunhao Pan, Zhifang Sui:
SG-FSM: A Self-Guiding Zero-Shot Prompting Paradigm for Multi-Hop Question Answering Based on Finite State Machine. 6025-6037 - Tanmay Parekh, Pradyot Prakash, Alexander Radovic, Akshay Shekher, Denis Savenkov:
Dynamic Strategy Planning for Efficient Question Answering with Large Language Models. 6038-6059 - Hamidreza Saffari, Mohammadamin Shafiei, Donya Rooein, Francesco Pierri, Debora Nozza:
Can I Introduce My Boyfriend to My Grandmother? Evaluating Large Language Models Capabilities on Iranian Social Norm Classification. 6060-6074 - Shwetha Somasundaram, Anirudh Phukan, Apoorv Saxena:
PLD+: Accelerating LLM Inference by Leveraging Language Model Artifacts. 6075-6089 - Kuan Wang, Yadong Lu, Michael Santacroce, Yeyun Gong, Chao Zhang, Yelong Shen:
Adapting LLM Agents with Universal Communication Feedback. 6090-6107 - Jean Vassoyan, Nathanaël Beau, Roman Plaud:
Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning. 6108-6118 - Chaoqun Liu, Wenxuan Zhang, Jiahao Ying, Mahani Aljunied, Anh Tuan Luu, Lidong Bing:
SeaExam and SeaBench: Benchmarking LLMs with Local Multilingual Questions in Southeast Asia. 6119-6136 - Xiang Gao, Ankita Sinha, Kamalika Das:
Learning to Search Effective Example Sequences for In-Context Learning. 6137-6146 - Harsh Nishant Lalai, Aashish Anantha Ramakrishnan, Raj Sanjay Shah, Dongwon Lee:
From Intentions to Techniques: A Comprehensive Taxonomy and Challenges in Text Watermarking for Large Language Models. 6147-6160 - Antoine Dussolle, Andrea Cardeña Díaz, Shota Sato, Peter Devine:
M-IFEval: Multilingual Instruction-Following Evaluation. 6161-6176 - Zhiqiang Zhong, Simon Sataa-Yu Larsen, Haoyu Guo, Tao Tang, Kuangyu Zhou, Davide Mottin:
Automatic Annotation Augmentation Boosts Translation between Molecules and Natural Language. 6177-6194 - Guoliang Zhu, Tao Ren, Dandan Wang, Jun Hu:
Let Modalities Teach Each Other: Modal-Collaborative Knowledge Extraction and Fusion for Multimodal Knowledge Graph Completion. 6195-6207 - Ondrej Sotolár, Michal Tkaczyk, Jaromír Plhák, David Smahel:
Modeling the Differential Prevalence of Online Supportive Interactions in Private Instant Messages of Adolescents. 6208-6226 - Ruiquan Zhang, Rui Zhao, Zhicong Wu, Liang Zhang, Haoqi Zhang, Yidong Chen:
Dynamic Feature Fusion for Sign Language Translation Using HyperNetworks. 6227-6239 - Sonam Gupta, Yatin Nandwani, Asaf Yehudai, Dinesh Khandelwal, Dinesh Raghu, Sachindra Joshi:
Selective Self-to-Supervised Fine-Tuning for Generalization in Large Language Models. 6240-6249 - Israel Abebe Azime, Atnafu Lambebo Tonja, Tadesse Destaw Belay, Yonas Chanie, Bontu Fufa Balcha, Negasi Haile Abadi, Henok Biadglign Ademtew, Mulubrhan Abebe Nerea, Debela Desalegn Yadeta, Derartu Dagne Geremew, Assefa Atsbiha Tesfau, Philipp Slusallek, Thamar Solorio, Dietrich Klakow:
ProverbEval: Exploring LLM Evaluation Challenges for Low-resource Language Understanding. 6250-6266 - Shizhou Huang, Bo Xu, Changqun Li, Yang Yu, Xin Alex Lin:
MRE-MI: A Multi-image Dataset for Multimodal Relation Extraction in Social Media Posts. 6267-6277 - Do Huu Dat, Duc Anh Do, Anh Tuan Luu, Wray L. Buntine:
Discrete Diffusion Language Model for Efficient Text Summarization. 6278-6290 - June M. Liu, He Cao, Renliang Sun, Rui Wang, Yu Li, Jiaxing Zhang:
CAPE: A Chinese Dataset for Appraisal-based Emotional Generation in Large Language Models. 6291-6309 - Hongbang Yuan, Yubo Chen, Pengfei Cao, Zhuoran Jin, Kang Liu:
Beyond Under-Alignment: Atomic Preference Enhanced Factuality Tuning for Large Language Models. 6310-6323 - Go Kamoda, Benjamin Heinzerling, Tatsuro Inaba, Keito Kudo, Keisuke Sakaguchi, Kentaro Inui:
Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference. 6324-6343 - Hoang Anh Just, Mahavir Dabas, Lifu Huang, Ming Jin, Ruoxi Jia:
DiPT: Enhancing LLM Reasoning through Diversified Perspective-Taking. 6344-6374 - Arian Askari, Roxana Petcu, Chuan Meng, Mohammad Aliannejadi, Amin Abolghasemi, Evangelos Kanoulas, Suzan Verberne:
SOLID: Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking Dialogs. 6375-6395 - Siyu Xu, Yunke Wang, Daochang Liu, Bo Du, Chang Xu:
CollagePrompt: A Benchmark for Budget-Friendly Visual Recognition with GPT-4V. 6396-6418 - Yaswanth M, Vaibhav Singh, Ayush Maheshwari, Amrith Krishna, Ganesh Ramakrishnan:
ARISE: Iterative Rule Induction and Synthetic Data Generation for Text Classification. 6419-6434 - Sangwon Yu, Ik-hwan Kim, Jongyoon Song, Saehyung Lee, Junsung Park, Sungroh Yoon:
Unleashing Multi-Hop Reasoning Potential in Large Language Models through Repetition of Misordered Context. 6435-6455 - Angelina Parfenova, Andreas Marfurt, Jürgen Pfeffer, Alexander Denzler:
Text Annotation via Inductive Coding: Comparing Human Experts to LLMs in Qualitative Data Analysis. 6456-6469 - Peng Cui, Mrinmaya Sachan:
Investigating the Zone of Proximal Development of Language Models for In-Context Learning. 6470-6483 - Itay Nakash, George Kour, Guy Uziel, Ateret Anaby-Tavor:
Breaking ReAct Agents: Foot-in-the-Door Attack Will Get You In. 6484-6509 - Pietro Tropeano, Maria Maistro, Tuukka Ruotsalo, Christina Lioma:
As easy as PIE: understanding when pruning causes language models to disagree. 6510-6536 - Shengbin Yue, Ting Huang, Zheng Jia, Siyuan Wang, Shujun Liu, Yun Song, Xuanjing Huang, Zhongyu Wei:
Multi-Agent Simulator Drives Language Models for Legal Intensive Interaction. 6537-6570 - Leonardo Ranaldi, Giulia Pucci:
Exploring Backward Reasoning in Large Language Models. 6571-6586 - Yuan-Ching Kuo, Yi Yu, Chih-Ming Chen, Chuan-Ju Wang:
MMLF: Multi-query Multi-passage Late Fusion Retrieval. 6587-6598 - Weidi Luo, He Cao, Zijing Liu, Yu Wang, Aidan Wong, Bin Feng, Yuan Yao, Yu Li:
Dynamic Guided and Domain Applicable Safeguards for Enhanced Security in Large Language Models. 6599-6620 - Maya K. Nachesa, Vlad Niculae:
kNN For Whisper And Its Effect On Bias And Speaker Adaptation. 6621-6627 - Cuong Chi Le, Hoang Chau Truong Vinh, Huy Nhat Phan, Dung Duy Le, Tien N. Nguyen, Nghi D. Q. Bui:
VisualCoder: Guiding Large Language Models in Code Execution with Fine-grained Multimodal Chain-of-Thought Reasoning. 6628-6645 - Luca Moroni, Giovanni Puccetti, Pere-Lluís Huguet Cabot, Andrei Stefan Bejgu, Alessio Miaschi, Edoardo Barba, Felice Dell'Orletta, Andrea Esuli, Roberto Navigli:
Optimizing LLMs for Italian: Reducing Token Fertility and Enhancing Efficiency Through Vocabulary Adaptation. 6646-6660 - Aarón Galiano Jiménez, Juan Antonio Pérez-Ortiz, Felipe Sánchez-Martínez, Víctor M. Sánchez-Cartagena:
Beyond the Mode: Sequence-Level Distillation of Multilingual Translation Models for Low-Resource Language Pairs. 6661-6676 - Taido Purason, Hele-Andra Kuulmets, Mark Fishel:
LLMs for Extremely Low-Resource Finno-Ugric Languages. 6677-6697 - Jinhyuk Lee, Anthony Chen, Zhuyun Dai, Dheeru Dua, Devendra Singh Sachan, Michael Boratko, Yi Luan, Sébastien M. R. Arnold, Vincent Perot, Siddharth Dalmia, Hexiang Hu, Xudong Lin, Panupong Pasupat, Aida Amini, Jeremy R. Cole, Sebastian Riedel, Iftekhar Naim, Ming-Wei Chang, Kelvin Guu:
LOFT: Scalable and More Realistic Long-Context Evaluation. 6698-6723 - Juraj Vladika, Florian Matthes:
On the Influence of Context Size and Model Choice in Retrieval-Augmented Generation Systems. 6724-6736 - Gerrit J. J. van den Burg, Gen Suzuki, Wei Liu, Murat Sensoy:
Aligning Black-box Language Models with Human Judgments. 6737-6749 - Xiangyu Wen, Jianyuan Zhong, Zhijian Xu, Qiang Xu:
Guideline Compliance in Task-Oriented Dialogue: The Chained Prior Approach. 6750-6776 - Jiawei Chen, Xiao Yang, Zhengwei Fang, Yu Tian, Yinpeng Dong, Zhaoxia Yin, Hang Su:
AutoBreach: Universal and Adaptive Jailbreaking with Efficient Wordplay-Guided Optimization via Multi-LLMs. 6777-6798 - Bingfeng Chen, Chenjie Qiu, Yifeng Xie, Boyan Xu, Ruichu Cai, Zhifeng Hao:
S²IT: Stepwise Syntax Integration Tuning for Large Language Models in Aspect Sentiment Quad Prediction. 6799-6806 - Md. Motahar Mahtab, Faisal Ahamed Khan, Md. Ekramul Islam, Md. Shahad Mahmud Chowdhury, Labib Imam Chowdhury, Sadia Afrin, Hazrat Ali, Mohammad Mamun Or Rashid, Nabeel Mohammed, Mohammad Ruhul Amin:
BanNERD: A Benchmark Dataset and Context-Driven Approach for Bangla Named Entity Recognition. 6807-6828 - Andres Algaba, Carmen Mazijn, Vincent Holst, Floriano Tori, Sylvia Wenmackers, Vincent Ginis:
Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias. 6829-6864 - Nickil Maveli, Antonio Vergari, Shay B. Cohen:
What can Large Language Models Capture about Code Functional Equivalence? 6865-6903 - Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li:
Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning. 6904-6917 - Jiwon Jeong, Hyeju Jang, Hogun Park:
Large Language Models Are Better Logical Fallacy Reasoners with Counterargument, Explanation, and Goal-Aware Prompt Formulation. 6918-6937 - Vlad-Andrei Negru, Robert Vacareanu, Camelia Lemnaru, Mihai Surdeanu, Rodica Potolea:
MorphNLI: A Stepwise Approach to Natural Language Inference Using Text Morphing. 6938-6953 - Dorde Klisura, Anthony Rios:
Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems. 6954-6976 - Goki Muramoto, Atsuki Sato, Takayoshi Koyama:
Media of Langue: Exploring Word Translation Network. 6977-6994 - Georgina Curto, Svetlana Kiritchenko, Muhammad Hammad Fahim Siddiqui, Isar Nejadgholi, Kathleen C. Fraser:
Tackling Social Bias against the Poor: a Dataset and a Taxonomy on Aporophobia. 6995-7016 - Lee Kezar, Nidhi Munikote, Zian Zeng, Zed Sevcikova Sehyr, Naomi Caselli, Jesse Thomason:
The American Sign Language Knowledge Graph: Infusing ASL Models with Linguistic Knowledge. 7017-7029 - Mohamed Salim Aissi, Clément Romac, Thomas Carta, Sylvain Lamprier, Pierre-Yves Oudeyer, Olivier Sigaud, Laure Soulier, Nicolas Thome:
Reinforcement Learning for Aligning Large Language Models Agents with Interactive Environments: Quantifying and Mitigating Prompt Overfitting. 7030-7046 - Usneek Singh, José Cambronero, Sumit Gulwani, Aditya Kanade, Anirudh Khatry, Vu Le, Mukul Singh, Gust Verbruggen:
An empirical study of validating synthetic data for formula generation. 7047-7054 - Ananya Singha, Mukul Singh, Ashish Tiwari, Sumit Gulwani, Vu Le, Chris Parnin:
TeCoFeS: Text Column Featurization using Semantic Analysis. 7055-7061 - Xi Xu, Wenda Xu, Siqi Ouyang, Lei Li:
CA*: Addressing Evaluation Pitfalls in Computation-Aware Latency for Simultaneous Speech Translation. 7062-7067 - Zhe Wang, Yanjun Qi:
Augmented Adversarial Trigger Learning. 7068-7100 - Qiusi Zhan, Richard Fang, Henil Shalin Panchal, Daniel Kang:
Adaptive Attacks Break Defenses Against Indirect Prompt Injection Attacks on LLM Agents. 7101-7117 - Weizhe Chen, Zhicheng Zhang, Guanlin Liu, Renjie Zheng, Wenlei Shi, Chen Dun, Zheng Wu, Xing Jin, Lin Yan:
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models. 7118-7127 - Sangyeop Kim, Hangyeul Lee, Yohan Lee:
HEISIR: Hierarchical Expansion of Inverted Semantic Indexing for Training-free Retrieval of Conversational Data using LLMs. 7128-7144 - Fanny Ducel, Nicolas Hiebel, Olivier Ferret, Karën Fort, Aurélie Névéol:
"Women do not have heart attacks!" Gender Biases in Automatically Generated Clinical Cases in French. 7145-7159 - Mingni Tang, Jiajia Li, Lu Yang, Zhiqiang Zhang, Jinhao Tian, Zuchao Li, Lefei Zhang, Ping Wang:
NOTA: Multimodal Music Notation Understanding for Visual Large Language Model. 7160-7173 - Juan Manuel Pérez, Paula Miguel, Viviana Cotik:
Exploring Large Language Models for Hate Speech Detection in Rioplatense Spanish. 7174-7187 - Creston Brooks, Johannes Haubold, Charlie Cowen-Breen, Jay White, Desmond DeVaul, Frederick Riemenschneider, Karthik R. Narasimhan, Barbara Graziosi:
An Annotated Dataset of Errors in Premodern Greek and Baselines for Detecting Them. 7188-7202 - João Matos, Shan Chen, Siena Placino, Yingya Li, Juan Carlos Climent Pardo, Daphna Idan, Takeshi Tohyama, David Restrepo, Luis Filipe Nakayama, Jose M. M. Pascual-Leone, Guergana K. Savova, Hugo J. W. L. Aerts, Leo Anthony Celi, An-Kwok Ian Wong, Danielle S. Bitterman, Jack Gallifant:
WorldMedQA-V: a multilingual, multimodal medical examination dataset for multimodal language models evaluation. 7203-7216 - Fabiha Haider, Fariha Tanjim Shifat, Md Farhan Ishmam, Md Sakib Ul Rahman Sourove, Deeparghya Dutta Barua, Md Fahim, Farhad Alam Bhuiyan:
BanTH: A Multi-label Hate Speech Detection Dataset for Transliterated Bangla. 7217-7236 - Yen-Ju Lu, Ting-Yao Hu, Hema Swetha Koppula, Hadi Pouransari, Jen-Hao Rick Chang, Yin Xia, Xiang Kong, Qi Zhu, Simon Wang, Oncel Tuzel, Raviteja Vemulapalli:
Mutual Reinforcement of LLM Dialogue Synthesis and Summarization Capabilities for Few-Shot Dialogue Summarization. 7237-7256 - Tyler Lizzo, Larry Heck:
UNLEARN Efficient Removal of Knowledge in Large Language Models. 7257-7268 - Jeremias Bohn, Frederic Mrozinski, Georg Groh:
Adaptive Parameter Compression for Language Models. 7269-7286 - Yijing Zhang, Dyah Adila, Changho Shin, Frederic Sala:
Personalize Your LLM: Fake it then Align it. 7287-7301 - Haitao Mao, Guangliang Liu, Yao Ma, Rongrong Wang, Kristen Marie Johnson, Jiliang Tang:
A Survey to Recent Progress Towards Understanding In-Context Learning. 7302-7323 - Youngwon Lee, Seung-won Hwang, Daniel F. Campos, Filip Gralinski, Zhewei Yao, Yuxiong He:
Inference Scaling for Bridging Retrieval and Augmented Generation. 7324-7339 - Aditya Sharma, Aman Dalmia, Mehran Kazemi, Amal Zouaq, Christopher Pal:
GeoCoder: Solving Geometry Problems by Generating Modular Code through Vision-Language Models. 7340-7356 - Meng-Chen Wu, Md Mosharaf Hossain, Tess Wood, Shayan Ali Akbar, Si-Chi Chin, Erwin Cornejo:
SEEval: Advancing LLM Text Evaluation Efficiency and Accuracy through Self-Explanation Prompting. 7357-7368 - Leonardo Ranaldi, Barry Haddow, Alexandra Birch:
When natural language is not enough: The limits of in-context learning demonstrations in multilingual reasoning. 7369-7396 - Tunazzina Islam, Dan Goldwasser:
Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop Strategy. 7397-7429 - Aleksandr Fedchin, Isabel Cooperman, Pramit Chaudhuri, Joseph P. Dexter:
AcrosticSleuth: Probabilistic Identification and Ranking of Acrostics in Multilingual Corpora. 7430-7437 - Xiaotang Gai, Chenyi Zhou, Jiaxiang Liu, Yang Feng, Jian Wu, Zuozhu Liu:
MedThink: A Rationale-Guided Framework for Explaining Medical Visual Question Answering. 7438-7450 - Yan Meng, Di Wu, Christof Monz:
How to Learn in a Noisy World? Self-Correcting the Real-World Data Noise in Machine Translation. 7451-7467 - Joel Mire, Zubin Trivadi Aysola, Daniel Chechelnitsky, Nicholas Deas, Chrysoula Zerva, Maarten Sap:
Rejected Dialects: Biases Against African American Language in Reward Models. 7468-7487 - Viet Cuong Nguyen, Mohammad Taher, Dongwan Hong, Vinicius Konkolics Possobom, Vibha Thirunellayi Gopalakrishnan, Ekta Raj, Zihang Li, Heather J. Soled, Michael L. Birnbaum, Srijan Kumar, Munmun De Choudhury:
Do Large Language Models Align with Core Mental Health Counseling Competencies? 7488-7511 - Zizhang Chen, Peizhao Li, Xiaomeng Dong, Pengyu Hong:
Uncertainty Quantification for Clinical Outcome Predictions with (Large) Language Models. 7512-7523 - Shrinidhi Kumbhar, Venkatesh Mishra, Kevin Coutinho, Divij Handa, Ashif Iquebal, Chitta Baral:
Hypothesis Generation for Materials Discovery and Design Using Goal-Driven and Constraint-Guided LLM Agents. 7524-7555 - Logan Barnhart, Reza Akbarian Bafghi, Stephen Becker, Maziar Raissi:
Aligning to What? Limits to RLHF Based Alignment. 7556-7591 - Srishti Yadav, Zhi Zhang, Daniel Hershcovich, Ekaterina Shutova:
Beyond Words: Exploring Cultural Value Sensitivity in Multimodal Models. 7592-7608 - Jeffrey Olmo, Jared Wilson, Max Forsey, Bryce Hepner, Thomas Vin Howe, David Wingate:
Features that Make a Difference: Leveraging Gradients for Improved Dictionary Learning. 7609-7619 - Botao Yu, Frazier N. Baker, Ziru Chen, Garrett Herb, Boyu Gou, Daniel Adu-Ampratwum, Xia Ning, Huan Sun:
Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving. 7620-7640 - Viacheslav Vasilev, Julia Agafonova, Nikolai Gerasimenko, Alexander Kapitanov, Polina Mikhailova, Evelina Mironova, Denis Dimitrov:
RusCode: Russian Cultural Code Benchmark for Text-to-Image Generation. 7641-7657 - Nikita Soni, Pranav Chitale, Khushboo Singh, Niranjan Balasubramanian, H. Andrew Schwartz:
Evaluation of LLMs-based Hidden States as Author Representations for Psychological Human-Centered NLP Tasks. 7658-7667 - Xiaoyu Liu, Paiheng Xu, Junda Wu, Jiaxin Yuan, Yifan Yang, Yuhang Zhou, Fuxiao Liu, Tianrui Guan, Haoliang Wang, Tong Yu, Julian J. McAuley, Wei Ai, Furong Huang:
Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey. 7668-7684 - Yizhou Chi, Kevin Yang, Dan Klein:
ThoughtSculpt: Reasoning with Intermediate Revision and Search. 7685-7711 - Ivan Lee, Taylor Berg-Kirkpatrick:
Optimizing Hidden Markov Language Models: An Empirical Study of Reparameterization and Initialization Techniques. 7712-7723 - Mina J. Kian, Kaleen Shrestha, Katrin Fischer, Xiaoyuan Zhu, Jonathan Ong, Aryan Trehan, Jessica Wang, Gloria Chang, Sébastien M. R. Arnold, Maja Mataric:
Using Linguistic Entrainment to Evaluate Large Language Models for Use in Cognitive Behavioral Therapy. 7724-7743 - Rahul Porwal, Alice Rozet, Jotsna Gowda, Pryce Houck, Kevin Tang, Sarah Moeller:
Analysis of LLM as a grammatical feature tagger for African American English. 7744-7756 - Anton Razzhigaev, Matvey Mikhalchuk, Temurbek Rahmatullaev, Elizaveta Goncharova, Polina Druzhinina, Ivan Oseledets, Andrey Kuznetsov:
LLM-Microscope: Uncovering the Hidden Role of Punctuation in Context Memory of Transformers. 7757-7764 - Xiaonan Jing, Srinivas Billa, Danny Godbout:
On A Scale From 1 to 5: Quantifying Hallucination in Faithfulness Evaluation. 7765-7780 - Paul Rosu:
LITERA: An LLM Based Approach to Latin-to-English Translation. 7781-7794 - Venkatesh Mishra, Bimsara Pathiraja, Mihir Parmar, Sat Chidananda, Jayanth Srinivasa, Gaowen Liu, Ali Payani, Chitta Baral:
Investigating the Shortcomings of LLMs in Step-by-Step Legal Reasoning. 7795-7826 - Siyi Liu, Kishaloy Halder, Zheng Qi, Wei Xiao, Nikolaos Pappas, Phu Mon Htut, Neha Anna John, Yassine Benajiba, Dan Roth:
Towards Long Context Hallucination Detection. 7827-7835 - Haoteng Yin, Jinha Kim, Prashant Mathur, Krishanu Sarker, Vidit Bansal:
How to Talk to Language Models: Serialization Strategies for Structured Entity Matching. 7836-7850 - Anthony Sicilia, Mert Inan, Malihe Alikhani:
Accounting for Sycophancy in Language Model Uncertainty Estimation. 7851-7866 - Jishnu Ray Chowdhury, Jayanth Mohan, Tomás Malík, Cornelia Caragea:
Zero-Shot Keyphrase Generation: Investigating Specialized Instructions and Multi-sample Aggregation on Large Language Models. 7867-7884 - Lisa Alazraki, Marek Rei:
Meta-Reasoning Improves Tool Use in Large Language Models. 7885-7897 - Abe Bohan Hou, Orion Weller, Guanghui Qin, Eugene Yang, Dawn J. Lawrie, Nils Holzenberger, Andrew Blair-Stanek, Benjamin Van Durme:
CLERC: A Dataset for U. S. Legal Case Retrieval and Retrieval-Augmented Analysis Generation. 7898-7913 - Allahsera Auguste Tapo, Nouhoum Souleymane Coulibaly, Seydou Diallo, Sébastien Diarra, Christopher M. Homan, Mamadou K. Keita, Michael Leventhal:
GAIfE: Using GenAI to Improve Literacy in Low-resourced Settings. 7914-7929 - Tiberiu Sosea, Cornelia Caragea:
Hard Emotion Test Evaluation Sets for Language Models. 7930-7944 - Ruoli Gan, Duanyu Feng, Chen Zhang, Zhihang Lin, Haochen Jia, Hao Wang, Zhenyang Cai, Lei Cui, Qianqian Xie, Jimin Huang, Benyou Wang:
UCL-Bench: A Chinese User-Centric Legal Benchmark for Large Language Models. 7945-7988 - Yan Li, So-Eon Kim, Seong-Bae Park, Soyeon Caren Han:
MIDAS: Multi-level Intent, Domain, And Slot Knowledge Distillation for Multi-turn NLU. 7989-8012 - Kian Ahrabian, Xihui Lin, Barun Patra, Vishrav Chaudhary, Alon Benhaim, Jay Pujara, Xia Song:
A Practical Analysis of Human Alignment with *PO. 8013-8021 - Yixin Liu, Pengfei Liu, Arman Cohan:
Understanding Reference Policies in Direct Preference Optimization. 8022-8037 - Saaket Agashe, Yue Fan, Anthony Reyna, Xin Eric Wang:
LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models. 8038-8057 - Vaishnavi Pulavarthi, Deeksha Nandal, Soham Dan, Debjit Pal:
AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation. 8058-8065 - Sihao Chen, Chaitanya Malaviya, Alex Fabrikant, Hagai Taitelbaum, Tal Schuster, Senaka Buthpitiya, Dan Roth:
On Reference (In-)Determinacy in Natural Language Inference. 8066-8078 - Yicheng Wang, Jiayi Yuan, Yu-Neng Chuang, Zhuoer Wang, Yingchi Liu, Mark Cusick, Param Kulkarni, Zhengping Ji, Yasser Ibrahim, Xia Hu:
DHP Benchmark: Are LLMs Good NLG Evaluators? 8079-8094 - Qiming Wu, Zichen Chen, Will Corcoran, Misha Sra, Ambuj K. Singh:
GraphEval36K: Benchmarking Coding and Reasoning Capabilities of Large Language Models on Graph Datasets. 8095-8117 - Qi Jia, Xiang Yue, Tuney Zheng, Jie Huang, Bill Yuchen Lin:
SimulBench: Evaluating Language Models with Creative Simulation Tasks. 8118-8131 - Millennium Bismay, Xiangjue Dong, James Caverlee:
ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning. 8132-8148 - Shilong Li, Yancheng He, Hui Huang, Xingyuan Bu, Jiaheng Liu, Hangyu Guo, Weixun Wang, Jihao Gu, Wenbo Su, Bo Zheng:
2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision. 8149-8173 - Yu Wang, Ryan A. Rossi, Namyong Park, Nesreen K. Ahmed, Danai Koutra, Franck Dernoncourt, Tyler Derr:
Demystifying the Power of Large Language Models in Graph Generation. 8174-8189 - Yuelin Bai, Xeron Du, Yiming Liang, Leo Jin, Junting Zhou, Ziqiang Liu, Feiteng Fang, Mingshan Chang, Tianyu Zheng, Xincheng Zhang, Nuo Ma, Zekun Moore Wang, Ruibin Yuan, Haihong Wu, Hongquan Lin, Wenhao Huang, Jiajun Zhang, Chenghua Lin, Jie Fu, Min Yang, Shiwen Ni, Ge Zhang:
COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning. 8190-8205 - Yu Wang, Jiaxin Zhang, Xiang Gao, Wendi Cui, Peng Li, Kamalika Das:
Gradient-guided Attention Map Editing: Towards Efficient Contextual Hallucination Mitigation. 8206-8217 - Yue Zhang, Leyang Cui, Wei Bi, Shuming Shi:
Alleviating Hallucinations of Large Language Models through Induced Hallucinations. 8218-8232 - Lin Ning, Harsh Lara, Meiqi Guo, Abhinav Rastogi:
MoDE: Effective Multi-task Parameter Efficient Fine-Tuning with a Mixture of Dyadic Experts. 8233-8246 - Zhilan Wang, Zekai Zhi, Rize Jin, Kehui Song, He Wang, Da-Jung Cho:
Unsupervised Sentence Representation Learning with Syntactically Aligned Negative Samples. 8247-8259 - Shensian Syu, Hung-yi Lee:
Hierarchical Speculative Decoding with Dynamic Window. 8260-8273 - CheolWon Na, YunSeok Choi, Jee-Hyong Lee:
Q-FAKER: Query-free Hard Black-box Attack via Controlled Generation. 8274-8289 - Xiang Li, Zhiyi Yin, Hexiang Tan, Shaoling Jing, Du Su, Yi Cheng, Huawei Shen, Fei Sun:
PRDetect: Perturbation-Robust LLM-generated Text Detection Based on Syntax Tree. 8290-8301 - Ahmed Elshabrawy, Yongxin Huang, Iryna Gurevych, Alham Fikri Aji:
Enabling Natural Zero-Shot Prompting on Encoder Models via Statement-Tuning. 8302-8321 - Kritarth Prasad, Mohammadi Zaki, Pratik Rakesh Singh, Pankaj Wasnik:
Faster Machine Translation Ensembling with Reinforcement Learning and Competitive Correction. 8322-8335 - Rahmad Mahendra, Damiano Spina, Lawrence Cavedon, Karin Verspoor:
Evaluating Numeracy of Language Models as a Natural Language Inference Task. 8336-8361 - Poulami Ghosh, Raj Dabre, Pushpak Bhattacharyya:
Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages. 8362-8396 - Seungbeen Lee, Seungwon Lim, Seungju Han, Giyeong Oh, Hyungjoo Chae, Jiwan Chung, Minju Kim, Beong-woo Kwak, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, Youngjae Yu:
Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics. 8397-8437 - Myrthe Reuver, Indira Sen, Matteo Melis, Gabriella Lapesa:
Tell Me What You Know About Sexism: Expert-LLM Interaction Strategies and Co-Created Definitions for Zero-Shot Sexism Detection. 8438-8467 - Jie Chi, Maureen de Seyssel, Natalie Schluter:
The Role of Prosody in Spoken Question Answering. 8468-8479 - Palaash Goel, Dushyant Singh Chauhan, Md. Shad Akhtar:
Target-Augmented Shared Fusion-based Multimodal Sarcasm Explanation Generation. 8480-8493 - Patricia Chiril, Trevor Spreadbury, Joeva Rock, Brian Dowd-Uribe, David Uminsky:
Seeds of Discourse: A Multilingual Corpus of Direct Quotations from African Media on Agricultural Biotechnologies. 8494-8500 - Xianjun Yang, Wei Cheng, Xujiang Zhao, Wenchao Yu, Linda Ruth Petzold, Haifeng Chen:
Position Really Matters: Towards a Holistic Approach for Prompt Tuning. 8501-8523
