Important Dates
| Pre-submission mentorship application | |
| Pre-submission mentorship feedback | |
| Submission deadline | |
| ARR commitment deadline | |
| Acceptance notification | |
| Camera-ready due | |
| Grant application submission due | |
| Grant application notification | |
| Workshop | July 5-7 (With Main Conference) |
Archival
Multi-Agent Reasoning Improves Compute Efficiency: Pareto-Optimal Test-Time Scaling
Florian Valentin Wunderlich; Lars Benedikt Kaesberg; Jan Philip Wahle; Terry Ruas; Bela Gipp
Claim Verification in the Age of Large Language Models: A Survey
Alphaeus Dmonte; Roland R Oruche; Marcos Zampieri; Prasad Calyam; Isabelle Augenstein
Language Directions in Multilingual LLMs: A Layer-wise Diagnostic Study of Token Alignment and Pretraining Imprint
Jea Sung Kim; Suan Lee
Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers
Rabin Adhikari
Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval
Matei Benescu; Ivo Pascal de Jong
Detecting Hallucinations in Large Language Models via Internal Attention Divergence Signals
Gijs van Dijk
Reflection in the Dark: Exposing and Escaping the Black Box in Reflective Prompt Optimization
Shiyan Liu; Qifeng Xia; Qiyun Xia; Yisheng Liu; Xinyu Yu; Rui Qu
Thesis Proposal: LLMs post-training for multilingual medical tasks. Instruction-Tuning, Continual-Pretraining or Reasoning?
Pietro Ferrazzi; Alberto Lavelli; Bernardo Magnini
Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices
Liu Zai; Iraklis A. Klampanos
Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning
Brady Steele
Semantic Contrastive Adaptation for Multimodal Figurative Language Understanding
Ayaan Siddiqui
Think Less, Code Better: Probing When Chain-of-Thought Hurts and How to Route Around It
Rajarshi Ghoshal; Salma Emad Mahmoud Abdelhalim; Debadri Basak; Pratibha Kaur Arora
Probing Functional Correctness in Diffusion Language Models
Guan-Ming Chiu; Jeng-Yue Liu
Thesis Proposal: Uncertainty as Adaptive Control: From Selection to Curriculum via Conformal Calibration
Peihong Li; Yan Yan
Thesis Proposal: On the Granularity-Robustness Trade-off in Text-Derived Knowledge Graphs
Surawat Pralomram
TokLens: A Multilingual Lens on Tokenizer Quality for LLMs
Guan-Ming Chiu
Phase Transitions in Affective Meaning Divergence: The Hidden Drift Before the Break
Napassorn Litchiowong
Sycophantic Anchors: Localizing and Quantifying User Agreement in Reasoning Models
Jacek Duszenko; Przemyslaw Kazienko; Jan Kocon
NEAT-IR: Neural Explainable Analysis Tool for Information Retrieval
Lev Sukherman; Artem Frenk; Nina Klimenkova; Connor Jason
BanglaSocialBench: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Bangladeshi Social Interaction
Tanvir Ahmed Sijan; S. M Golam Rifat; Pankaj Chowdhury Partha; Md. Tanjeed Islam; Md Musfique Anwar
Interpretability of LLM Classifiers via the Rational Inattention Theory with Application to Hate Speech Detection
Yuan Zhao; Ali Abdi
The Shape of Vulnerability: How Adversarial Perturbations Reshape the Topology of Language Model Latent Spaces
Angelina Tsai; Shreya Subramanian; Catherine Liu; Kimberly Lopez; Leif Zinn-Brooks; Alexia Schulz; Adaku Uchendu
LLM-based Literal Example Generation for Japanese Multiword Expressions
Mio Ohashi; Hajime Kiyama; Zhidong Ling; Mamoru Komachi
Presentation Slide Translation and Layout Error Correction by LLMs
Futo Kajita; Nobuyori Nishimura; Takehito Utsuro; Naoki Muto; Chee Siang Leow; Hiromitsu Nishizaki
Constructing a Japanese Rap Lyric Generation Model with GRPO
Hayato Ogawa; Daisuke Kawahara
Tracking the Evolution of Foresight Signals in News Data: The Case of the European Electric Vehicle Market
Karine Navasartian
Cultural Value Alignment Via Latent Activation Steering in Large Language Models
Trung Duc Anh Dang; Sarah Masud
Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data
Navyansh Singh
Thesis Proposal: Bring Linguistics Back to Cryptanalysis - Using Attestation to Break the Advanced Encryption Standard
Madeline Boese
Garden Path Recovery in Causal and Masked Language Models
Sanjan Baitalik; Rajashik Datta
Confidence as a Tie-Breaker: Reassessing Multilingual Hedging Bias in LLM-as-a-Judge Evaluation
Rajashik Datta; Sanjan Baitalik
BanglaSTEM: A Parallel Corpus and Term-Weighted Evaluation for Technical Bangla-English Translation
Kazi Reyazul Hasan; A. B. M. Alim Al Islam; Muhammad Abdullah Adnan
Believing is Seeing: How Token Inflation Mechanistically Erodes Theory of Mind in Large Language Models
Zhizhi Wang; Ruochen Zhang
Disentangling the Effects of Unlearning in Measuring Parametric Faithfulness of Chain-of-Thought
Ryo Mitsuhashi; Gaku Morio; Ayana Niwa; Masahiro Kaneko; Kentaro Inui; Terufumi Morishita; Yuta Koreeda; Yasuhiro Sogawa
FedPAGR: Federated Prototype Alignment via Geometric Refinement for Heterogeneous Architectures
Kris Prasad; Md Abdullah Al Hafiz Khan
Sentiment Analysis of Yelp Review Dataset: A Comparative Study of Machine Learning Methods
Krishna Thakar; Mohamed Abu Sheha; Emmanuel Thompson
Semantic Span Annotation: An Exploratory Study of LLM Annotation
Tejas Goyal; Dhriti Krishnan; Anuj Gupta; Jaromir Savelka
Thesis Proposal: An Explainable Multimodal Framework for Detecting Harmful Content in Code-Switched Children’s Media
Juliana Isabelle A. Guillermo; Jasper Kyle Catapang; Nathaniel Oco
Test-Time Strategies for More Efficient and Accurate Agentic RAG
Abhinav Sharma; Brian Zhang; Deepti Guntur; Zhiyang Zuo; Shreyas Chaudhari; Wenlong Zhao; Franck Dernoncourt; Puneet Mathur; Ryan A. Rossi; Nedim Lipka
Eye Movement Features Can Predict Human Preferences on Machine-Generated Texts
Xiaoshan He; Xiaoqun Liu; Haodong He; Yu Wang; Yang Xu
Thesis Proposal: Diagnosing and Mitigating Semantic Interference in Script-Sharing Low-Resource Language Models: A Case Study on Square Bai Script
Jingting Zheng; Deyi Xiong
Does Locality Cost in Polish Medical Text Classification? Duplicate-Aware Evaluation of Federated Learning
Daniel Cieślak; Andrzej Czyżewski
Analyzing Hate Speech Amplification on Fringe Platforms
Anika Ghosh Basu
The Silence of the Facts: Popularity as a Barrier to Machine Unlearning
Anna Borisiuk; Andrey Savchenko; Alexander Panchenko; Elena Tutubalina
Leakage-Aware User-Level ADHD Signal Classification from Social Media: When Graph Aggregation Helps, and When It Does Not
Daniel Cieślak; Władysław Średniawa
CAL-Log: Cost-Aware Active Learning with Logarithmic Cognitive Effort Modeling and Online Adaptation to Human Annotation Behavior
Vihanga Supasan Kariyakaranage; Banuka Athuraliya
Thesis Proposal: Targeted and Unified Cross-Lingual Unlearning from Multilingual Language Models
Jan Bronec; Jindřich Helcl
A11y-Compressor: A Framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction
Michito Takeshita; Takuro Kawada; Takumi Ohashi; Shunsuke Kitada; Hitoshi Iyatomi
Beyond Static Cropping: Layer-Adaptive Visual Localization and Decoding Enhancement
Zipeng Zhu; Zhanghao Hu; Qinglin Zhu; Jingyong Su; Yulan He; Lin Gui
Counterspeech Generation using Small Language Models
Abubakar Sadiq Muhammad; Simona Frenda; Gavin Abercrombie
From Graphs to Hypergraphs: Enhancing Aspect-Term Sentiment Analysis via Multi-Level Relational Modeling
Omkar Mahesh Kashyap; Padegal Amit; Madhav Kashyap; Ashwini M Joshi; Shylaja S S
Probing Bias Formation in Medical LLMs through Activation Steering
Bayram Ayadi; Annette Hautli-Janisz
Faithfulness Beyond Plausibility: Auditing Human Explanations in Educational Assessment
Ria Talsania; Dhruv Ritesh Shah; Sudhir Dhage
CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement
Odwitiyo Dutta; Dinesh K Vishwakarma
Measuring and Mitigating Shortcut Reliance in Language Models with Probe-Based Representation Entanglement
Divyajot Singh
LAMP-MedQA: A Lightweight Multi-Agent System for Patient-Oriented Medical Question Answering
Jack A. Johnson; Meghali Banerjee; Joseph Crawford; James Welch; Jim Davies; Tingyan Wang
Inference-Time Feedback for Reasoning Controllability in Diffusion Language Models
Clovis Barbour; Huixin Zhan
Disentangling Linguistic Relatedness from Task Alignment in Cross-Lingual Transfer
Ahmed Haj Ahmed; Ruochen Zhang; Alvin C Grissom II
PE-QAT: Parameter-Efficient Quantization-Aware Training for Large Language Models
Shresth Mishra
Fusion Training for Mathematical Generalization in Large Language Models
Congfeng Cao; Pengyu Zhang; Jelke Bloem
Does Topic Sentiment Cause Perceived Ideology? Comparing Human and LLM Annotations in Political News Articles
Upasana Chatterjee
Understanding Conversational Implicatures in Humans and LLMs
Daeun Kang
Thesis Proposal: Self-Adaptive and Epistemic Uncertainty-Guided ASR of Dense Intra-Sentential Code-Switched Speech for African Low-Resource Languages
Umar Baba Umar
RegTrack: A Fine-Grained Benchmark for Multi-Class Legal Change Detection
Joe Yu; Kevin Chenhao Li; Julian Ostarek
Validator-Guided Hard Negative Mining for Masked Language Modeling in Low-Resource Ancient Languages
Andrei Voinea
Conformal LLM Routing with Distribution-Free Safety Guarantees
Iqtedar Uddin; André Bauer
Supervision versus Demonstration-Based In-Context Learning for Multiword Expression Classification
Sercan Karakas; Yusuf Şimşek
When Models Hesitate: Answer Instability as a Label-Free Uncertainty Signal for LLMs
Jasper Meynard Arana; Kristine Ann M. Carandang; Ethan Robert Casin; Christian Alis; Christopher Monterola
HARP: Representation-Based Preference Learning for Perceptual Data
Jordan Sinclair; Yousra Shleibik; Kerstin Haring
Thesis Proposal: Toward a Human-Centered and Perspective-Aware Framework for Reproducible ML Evaluation and AI Alignment
Deepak Pandita; Christopher M Homan
One Panel Does Not Fit All: Case-Adaptive Multi-Agent Deliberation for Clinical Prediction
Yuxing Lu; Yushuhong Lin; Jason Zhang
LLMs for Now, Fine-Tuning for Later: An Ensemble Approach to Data Drift in Domain-Specific Tasks
Yuxuan Lu; Bingsheng Yao; Shao Zhang; Yisi Sang; Yun Wang; Hansu Gu; Peng Zhang; Tun Lu; Toby Jia-Jun Li; Dakuo Wang
Thesis Proposal: When Does an Agent Know It Is Lost? Confidence Trajectory Analysis for Tool-Using LLMs
Zhenjiang Mao
Task Assignment meets Annotator Modeling: Human-LLM Collaborative Annotation with Constraints
Kei Moriyama; Kouta Nakayama; Yukino Baba
Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation
Luke Zhang; Justin Vasselli; Aditya Khan; York Hay Ng; En-Shiun Annie Lee
Tonal Salience in Cognitive Decline: In-Context MCI Detection with Multimodal LLMs
Christopher Song; Abdullah P. Rashed Ahmed
What Moves the Pareto Frontier in Tool-Using Agents? A Compute-Aware Study of ReAct Variants
Rishi N. Simhadri
Filling the Long Tail: Structure-Aware Curriculum-Gap Completion for Medical Education with LLMs
Wenjie Lin
Mechanistic Analysis of Universality: Numerical Comparison Circuits Across Transformer Architectures
Arya Bhardia; Julian Ramirez; Siddhanta Verma; Karen Mkrtchyan
How Hard is Math? Using Quantitative Metrics to Measure LLM Alignment to Human Intuitions of Difficulty
Micah Helzerman; Steven R Wilson; Cam McLeman
Fine-Grained Semantic Comparison of Legal Documents using LLMs
Elisei Rykov; Nikolay Ivanov; Maria Bandulevich; Kseniia Petrushina; Valentin Malykh; Vasily Konovalov; Alexander Panchenko; Ilseyar Alimova
From Fluent to Useful: Generative AI That Models Purpose, Audience, and Presenter for Scientific Communication
Ishani Mondal
Beyond Discrete Search: Divergent Thinking as Intention Optimization in Latent Space
Mateusz Bystroński; Grzegorz Piotrowski; Tomasz Jan Kajdanowicz
Boosting Self-Consistency with Ranking
Maria Marina; Daniil Moskovskiy; Sergey Pletenev; Mikhail Salnikov; Alexander Panchenko; Viktor Moskvoretskii
LLM-Based Zero-Shot Soft Labeling for Anticipating Disagreement in Negotiation Dialogues
Ken Watanabe; Katsuhide Fujita
Analysis of the Neglect-Zero Effect in Large Language Models
Jin Tanaka; Daiki Matsuoka; Ryoma Kumon; Hitomi Yanaka
Morphology-Aware Multi-Granularity Representation Learning for Agglutinative Languages
Zhonghao Zhang; Na Liu; Jiajia Ma; Nier Wu; Guiping Liu
An Incremental CYK Recognizer for GPU-accelerated General Context-free Prefix Validation
Jiacheng Zhang; Ayesha Khatun; Steven Bethard
Processing Inconsistency Predicts Language Competence: LLM Evaluation Without Answer Labels on Turkic Languages
Ilya Galyukshev; Ilseyar Alimova
TableMBR: Minimum Bayes Risk Table Generation Based on Structural Consistency
Yoshida Daiki; Hiroyuki Deguchi; Yusuke Sakai; Hidetaka Kamigaito; Taro Watanabe
Through the Looking Glass of Multilingual AI: Contrasting Language- and Name Script-Dependent Ethnic Hierarchies in GPT and DeepSeek
Jonathan Sakunkoo; Annabella Sakunkoo
Thesis Proposal: Auditing and Mitigating Demographic Bias in Multi-Stage Retrieval Systems for Criminal Justice Applications
Archan Dutta
Contextual Diversity Measure (CDM) for Controllable Story Generation in Large Language Models
Richard Susilo; Hanna Suominen; Patrik Haslum
Constructing a Japanese Verdict Prediction Dataset for Fact-Checking of LLM-Generated Texts
Miwa Masano; Hirokazu Kiyomaru; Atsushi Keyaki; Kaito Horio; Rei Minamoto; Ribeka Keyaki; Kouta Nakayama; Hideyuki Tachibana; Daisuke Kawahara
Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models
Divyaksh Shukla; Ashutosh Modi
Disentangling Meaning and Language Components in Diverse Multilingual Sentence Embeddings
Kanade Nonomura; Keita Fukushima; Risa Kondo; Tomoyuki Kajiwara
Linguistically-Informed Evaluation of LLMs on Acceptability Judgments in a Forced-Choice Paradigm
Ziyue Liu; Nils Reiter
Representing Lean Proofs as Trajectories in Latent Space
Elisaveta Samoylov; Soroush Vosoughi
Evaluation of Multilingual Ability to Use Spatial Deictic Expressions in Vision-Language Models
Kaito Watanabe; Taisei Yamamoto; Tomoki Doi; Hitomi Yanaka
LLM Parameters for Math Across Languages: Shared or Separate?
Behzad Shomali; Luisa Victor; Tim Selbach; Ali Hamza Bashir; David Berghaus; Joachim Koehler; Mehdi Ali; Markus Frey
Thesis Proposal: Intentional Inference for Insight Generation
Kristýna Onderková
Reference-Free Schema Generation for Literature Review Tables via Multi-Faceted Rewards
Sinjoy Saha; Suman Saha; Mahfuza Farooque; Wenpeng Yin
Factual State Discovery Benchmark: Evaluating Fact Elicitation in Polish Tax Law
Mateusz Bystroński; Kamil Tagowski; Denis Janiak; Julia Farganus; Lukasz Augustyniak; Monika Kajdanowicz; Tomasz Jan Kajdanowicz
Evolutionary Search for Automated Design of Uncertainty Quantification Methods
Mikhail Seleznyov; Daniil Korbut; Viktor Moskvoretskii; Oleg Somov; Alexander Panchenko; Elena Tutubalina
Thesis Proposal: A Normalization-First Framework for Sound, Complete, and Utility-Ready Open Information Extraction
Chandan Prakash; Pavan Kumar Chittimalli; Arnab Bhattacharya
Mind the Gap: Multilingual Divide in LLM Bias Detection and Reasoning
Medha Hira; Prachi Goyal; Raj Maheshwari; Arnav Goel
Multi-Constraint State Tracking with Negation: A Diagnostic Benchmark for LLM World Modeling
Ayan Sar; Pranav Singh Puri; Sumit Aich; Anurag Kaushish; Tanupriya Choudhury; Ajith Abraham
Learning Shortcut Models for Efficient Recursive Reasoning
Shiv Shankar
Convergent Demographic Utility Hierarchies: Geometry of Intersectional Values in LLMs
Pravish Sainath
CRL-Prompt: Contrastive and Reinforcement Learning for Soft Prompt Tuning for Text Classification
Danila Lapokin; Andrey Savchenko
Optimizing Packing and Shuffling Strategies for Enhanced Performance in Generative Language Models
Yanbing Chen; Ruilin Wang; Zihao Yang; Lavender Yao Jiang; Eric Karl Oermann
LLM as a Meta-Judge: Synthetic Data for NLP Evaluation Metric Validation
Lukáš Eigler; Jindřich Libovický; David Hurych
Continuous Context Sampling Allows Extending Diversity Boundaries of Large Language Models
Mateusz Bystroński; Doheon Han; Nitesh V. Chawla; Tomasz Jan Kajdanowicz
One Task Vector is not Enough: A Large-Scale Study for In-Context Learning
Pavel Tikhonov; Ivan Oseledets; Elena Tutubalina
Non-archival
RAQE: Reranker-Aligned Query Expansion via Label-Free Group-Relative Policy Optimization
Gyeonghun Sun; Jeonghwan Choi; Hwanjun Song
Neural KWIC: Inducing Contextualized Word Embeddings from KWIC Concordance Examples
Mao Shimada; Hajime Kiyama; Zhidong Ling; Mamoru Komachi; Toshinobu Ogiso; Hiroya Takamura; Daichi Mochihashi
Thesis Proposal: Establishing Rigorous Evaluation of Sycophancy in Pretrained Language Models
Jan Batzner
Identifying the Convergent Sycophancy Gap in Model Evaluations
Jan Batzner; Volker Stocker; Stefan Schmid; Gjergji Kasneci
Understanding Clinical Cognitive Dialogues Using Large Language Models
Vishalakshi Arumugam; Dan Schumacher; Veronica Rammouz; Enrique Gonzalez; Jeremy J Davis; Anthony Rios
MetaCog-Bench: Quantifying the Metacognition Gap in Edge LLM Tool Calling Under Information Insufficiency
Yu-An Lu; Chun-En Hsiao; Chengwei Chiang; Hong-Han Shuai
EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors
Ryuhei Miyazato; Shunsuke Kitada; Kei Harada
Thesis Proposal: Sensitivity of MT Evaluation Metrics to Semantic Errors: A Case Study on Swedish–Finnish Translation
Nuo Xu
RECON: Benchmarking Agent Memory for Compositional Reasoning over Long Contexts
Mihir Shriniwas Arya
Thesis Proposal: Rethinking Safety Evaluation in Large Language Models
Khaoula Chehbouni
Dissociating Circuit-Level and Distribution-Level Effects of Knowledge Conflicts in LLMs
Pravish Sainath
The signal is coming from inside the noun phrase! Tracking semantic proto-role inferences during sentence processing
Lucas Y. Li; Zander Lynch; Marten Van Schijndel
The Confident Liar: Diagnosing Multi-Agent Debate with Log-Probabilities and LLM-as-Judge
Ali Keramati; Justin Cheok; Jacob Horne; Mark Warschauer
Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)
Guido Ivetta; Pietro Palombini; Sofía Martinelli; Marcos J Gomez; M Emilia Echeveste; Sunipa Dev; Vinodkumar Prabhakaran; Luciana Benotti
Metadata Conditioned Large Language Models for Localization
Anjishnu Mukherjee; Ziwei Zhu; Antonios Anastasopoulos
Think Anywhere in Code Generation
Xue Jiang; Tianyu Zhang; Ge Li; Mengyang Liu; Taozhi Chen; Zhenhua Xu; Wenpin Jiao; Zhi Jin; Yihong Dong