Important Dates

* All deadlines are at 11:59 pm UTC-12 ("anywhere on Earth").

Pre-submission mentorship application: Feb 4 (Wed), 2026
Pre-submission mentorship feedback: Feb 25 (Wed), 2026
Submission deadline: Mar 18 (Wed), 2026
ARR commitment deadline: Apr 15 (Wed), 2026
Acceptance notification: Apr 24 (Fri), 2026
Camera-ready due: May 8 (Fri), 2026
Grant application submission due: May 8 (Fri), 2026
Grant application notification: May 20 (Wed), 2026
Workshop: July 5-7 (with main conference)

Archival

Multi-Agent Reasoning Improves Compute Efficiency: Pareto-Optimal Test-Time Scaling
Florian Valentin Wunderlich; Lars Benedikt Kaesberg; Jan Philip Wahle; Terry Ruas; Bela Gipp

Claim Verification in the Age of Large Language Models: A Survey
Alphaeus Dmonte; Roland R Oruche; Marcos Zampieri; Prasad Calyam; Isabelle Augenstein

Language Directions in Multilingual LLMs: A Layer-wise Diagnostic Study of Token Alignment and Pretraining Imprint
Jea Sung Kim; Suan Lee

Emergence of Minimal Circuits for Indirect Object Identification in Attention-Only Transformers
Rabin Adhikari

Why Large Language Models can Secretly Outperform Embedding Similarity in Information Retrieval
Matei Benescu; Ivo Pascal de Jong

Detecting Hallucinations in Large Language Models via Internal Attention Divergence Signals
Gijs van Dijk

Reflection in the Dark: Exposing and Escaping the Black Box in Reflective Prompt Optimization
Shiyan Liu; Qifeng Xia; Qiyun Xia; Yisheng Liu; Xinyu Yu; Rui Qu

Thesis Proposal: LLMs post-training for multilingual medical tasks. Instruction-Tuning, Continual-Pretraining or Reasoning?
Pietro Ferrazzi; Alberto Lavelli; Bernardo Magnini

Peek2: Regex-free Byte-level Byte-Pair Encoding Pretokenizer for LLM Inference on Edge Devices
Liu Zai; Iraklis A. Klampanos

Annotation Entropy Predicts Per-Example Learning Dynamics in LoRA Fine-Tuning
Brady Steele

Semantic Contrastive Adaptation for Multimodal Figurative Language Understanding
Ayaan Siddiqui

Think Less, Code Better: Probing When Chain-of-Thought Hurts and How to Route Around It
Rajarshi Ghoshal; Salma Emad Mahmoud Abdelhalim; Debadri Basak; Pratibha Kaur Arora

Probing Functional Correctness in Diffusion Language Models
Guan-Ming Chiu; Jeng-Yue Liu

Thesis Proposal: Uncertainty as Adaptive Control: From Selection to Curriculum via Conformal Calibration
Peihong Li; Yan Yan

Thesis Proposal: On the Granularity-Robustness Trade-off in Text-Derived Knowledge Graphs
Surawat Pralomram

TokLens: A Multilingual Lens on Tokenizer Quality for LLMs
Guan-Ming Chiu

Phase Transitions in Affective Meaning Divergence: The Hidden Drift Before the Break
Napassorn Litchiowong

Sycophantic Anchors: Localizing and Quantifying User Agreement in Reasoning Models
Jacek Duszenko; Przemyslaw Kazienko; Jan Kocon

NEAT-IR: Neural Explainable Analysis Tool for Information Retrieval
Lev Sukherman; Artem Frenk; Nina Klimenkova; Connor Jason

BanglaSocialBench: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Bangladeshi Social Interaction
Tanvir Ahmed Sijan; S. M Golam Rifat; Pankaj Chowdhury Partha; Md. Tanjeed Islam; Md Musfique Anwar

Interpretability of LLM Classifiers via the Rational Inattention Theory with Application to Hate Speech Detection
Yuan Zhao; Ali Abdi

The Shape of Vulnerability: How Adversarial Perturbations Reshape the Topology of Language Model Latent Spaces
Angelina Tsai; Shreya Subramanian; Catherine Liu; Kimberly Lopez; Leif Zinn-Brooks; Alexia Schulz; Adaku Uchendu

LLM-based Literal Example Generation for Japanese Multiword Expressions
Mio Ohashi; Hajime Kiyama; Zhidong Ling; Mamoru Komachi

Presentation Slide Translation and Layout Error Correction by LLMs
Futo Kajita; Nobuyori Nishimura; Takehito Utsuro; Naoki Muto; Chee Siang Leow; Hiromitsu Nishizaki

Constructing a Japanese Rap Lyric Generation Model with GRPO
Hayato Ogawa; Daisuke Kawahara

Tracking the Evolution of Foresight Signals in News Data: The Case of the European Electric Vehicle Market
Karine Navasartian

Cultural Value Alignment Via Latent Activation Steering in Large Language Models
Trung Duc Anh Dang; Sarah Masud

Debiasing Logical Fallacy Detection for Real-World Robustness via Counterfactually Augmented Data
Navyansh Singh

Thesis Proposal: Bring Linguistics Back to Cryptanalysis - Using Attestation to Break the Advanced Encryption Standard
Madeline Boese

Garden Path Recovery in Causal and Masked Language Models
Sanjan Baitalik; Rajashik Datta

Confidence as a Tie-Breaker: Reassessing Multilingual Hedging Bias in LLM-as-a-Judge Evaluation
Rajashik Datta; Sanjan Baitalik

BanglaSTEM: A Parallel Corpus and Term-Weighted Evaluation for Technical Bangla-English Translation
Kazi Reyazul Hasan; A. B. M. Alim Al Islam; Muhammad Abdullah Adnan

Believing is Seeing: How Token Inflation Mechanistically Erodes Theory of Mind in Large Language Models
Zhizhi Wang; Ruochen Zhang

Disentangling the Effects of Unlearning in Measuring Parametric Faithfulness of Chain-of-Thought
Ryo Mitsuhashi; Gaku Morio; Ayana Niwa; Masahiro Kaneko; Kentaro Inui; Terufumi Morishita; Yuta Koreeda; Yasuhiro Sogawa

FedPAGR: Federated Prototype Alignment via Geometric Refinement for Heterogeneous Architectures
Kris Prasad; Md Abdullah Al Hafiz Khan

Sentiment Analysis of Yelp Review Dataset: A Comparative Study of Machine Learning Methods
Krishna Thakar; Mohamed Abu Sheha; Emmanuel Thompson

Semantic Span Annotation: An Exploratory Study of LLM Annotation
Tejas Goyal; Dhriti Krishnan; Anuj Gupta; Jaromir Savelka

Thesis Proposal: An Explainable Multimodal Framework for Detecting Harmful Content in Code-Switched Children’s Media
Juliana Isabelle A. Guillermo; Jasper Kyle Catapang; Nathaniel Oco

Test-Time Strategies for More Efficient and Accurate Agentic RAG
Abhinav Sharma; Brian Zhang; Deepti Guntur; Zhiyang Zuo; Shreyas Chaudhari; Wenlong Zhao; Franck Dernoncourt; Puneet Mathur; Ryan A. Rossi; Nedim Lipka

Eye Movement Features Can Predict Human Preferences on Machine-Generated Texts
Xiaoshan He; Xiaoqun Liu; Haodong He; Yu Wang; Yang Xu

Thesis Proposal: Diagnosing and Mitigating Semantic Interference in Script-Sharing Low-Resource Language Models: A Case Study on Square Bai Script
Jingting Zheng; Deyi Xiong

Does Locality Cost in Polish Medical Text Classification? Duplicate-Aware Evaluation of Federated Learning
Daniel Cieślak; Andrzej Czyżewski

Analyzing Hate Speech Amplification on Fringe Platforms
Anika Ghosh Basu

The Silence of the Facts: Popularity as a Barrier to Machine Unlearning
Anna Borisiuk; Andrey Savchenko; Alexander Panchenko; Elena Tutubalina

Leakage-Aware User-Level ADHD Signal Classification from Social Media: When Graph Aggregation Helps, and When It Does Not
Daniel Cieślak; Władysław Średniawa

CAL-Log: Cost-Aware Active Learning with Logarithmic Cognitive Effort Modeling and Online Adaptation to Human Annotation Behavior
Vihanga Supasan Kariyakaranage; Banuka Athuraliya

Thesis Proposal: Targeted and Unified Cross-Lingual Unlearning from Multilingual Language Models
Jan Bronec; Jindřich Helcl

A11y-Compressor: A Framework for Enhancing the Efficiency of GUI Agent Observations through Visual Context Reconstruction and Redundancy Reduction
Michito Takeshita; Takuro Kawada; Takumi Ohashi; Shunsuke Kitada; Hitoshi Iyatomi

Beyond Static Cropping: Layer-Adaptive Visual Localization and Decoding Enhancement
Zipeng Zhu; Zhanghao Hu; Qinglin Zhu; Jingyong Su; Yulan He; Lin Gui

Counterspeech Generation using Small Language Models
Abubakar Sadiq Muhammad; Simona Frenda; Gavin Abercrombie

From Graphs to Hypergraphs: Enhancing Aspect-Term Sentiment Analysis via Multi-Level Relational Modeling
Omkar Mahesh Kashyap; Padegal Amit; Madhav Kashyap; Ashwini M Joshi; Shylaja S S

Probing Bias Formation in Medical LLMs through Activation Steering
Bayram Ayadi; Annette Hautli-Janisz

Faithfulness Beyond Plausibility: Auditing Human Explanations in Educational Assessment
Ria Talsania; Dhruv Ritesh Shah; Sudhir Dhage

CBAL: Context-Based Agentic Learning for Speaker Diarization Segmentation Refinement
Odwitiyo Dutta; Dinesh K Vishwakarma

Measuring and Mitigating Shortcut Reliance in Language Models with Probe-Based Representation Entanglement
Divyajot Singh

LAMP-MedQA: A Lightweight Multi-Agent System for Patient-Oriented Medical Question Answering
Jack A. Johnson; Meghali Banerjee; Joseph Crawford; James Welch; Jim Davies; Tingyan Wang

Inference-Time Feedback for Reasoning Controllability in Diffusion Language Models
Clovis Barbour; Huixin Zhan

Disentangling Linguistic Relatedness from Task Alignment in Cross-Lingual Transfer
Ahmed Haj Ahmed; Ruochen Zhang; Alvin C Grissom II

PE-QAT: Parameter-Efficient Quantization-Aware Training for Large Language Models
Shresth Mishra

Fusion Training for Mathematical Generalization in Large Language Models
Congfeng Cao; Pengyu Zhang; Jelke Bloem

Does Topic Sentiment Cause Perceived Ideology? Comparing Human and LLM Annotations in Political News Articles
Upasana Chatterjee

Understanding Conversational Implicatures in Humans and LLMs
Daeun Kang

Thesis Proposal: Self-Adaptive and Epistemic Uncertainty-Guided ASR of Dense Intra-Sentential Code-Switched Speech for African Low-Resource Languages
Umar Baba Umar

RegTrack: A Fine-Grained Benchmark for Multi-Class Legal Change Detection
Joe Yu; Kevin Chenhao Li; Julian Ostarek

Validator-Guided Hard Negative Mining for Masked Language Modeling in Low-Resource Ancient Languages
Andrei Voinea

Conformal LLM Routing with Distribution-Free Safety Guarantees
Iqtedar Uddin; André Bauer

Supervision versus Demonstration-Based In-Context Learning for Multiword Expression Classification
Sercan Karakas; Yusuf ŞİMŞEK

When Models Hesitate: Answer Instability as a Label-Free Uncertainty Signal for LLMs
Jasper Meynard Arana; Kristine Ann M. Carandang; Ethan Robert Casin; Christian Alis; Christopher Monterola

HARP: Representation-Based Preference Learning for Perceptual Data
Jordan Sinclair; Yousra Shleibik; Kerstin Haring

Thesis Proposal: Toward a Human-Centered and Perspective-Aware Framework for Reproducible ML Evaluation and AI Alignment
Deepak Pandita; Christopher M Homan

One Panel Does Not Fit All: Case-Adaptive Multi-Agent Deliberation for Clinical Prediction
Yuxing Lu; Yushuhong Lin; Jason Zhang

LLMs for Now, Fine-Tuning for Later: An Ensemble Approach to Data Drift in Domain-Specific Tasks
Yuxuan Lu; Bingsheng Yao; Shao Zhang; Yisi Sang; Yun Wang; Hansu Gu; Peng Zhang; Tun Lu; Toby Jia-Jun Li; Dakuo Wang

Thesis Proposal: When Does an Agent Know It Is Lost? Confidence Trajectory Analysis for Tool-Using LLMs
Zhenjiang Mao

Task Assignment meets Annotator Modeling: Human-LLM Collaborative Annotation with Constraints
Kei Moriyama; Kouta Nakayama; Yukino Baba

Dynamic Meta-Metrics: Source-Sentence Conditioned Weighting for MT Evaluation
Luke Zhang; Justin Vasselli; Aditya Khan; York Hay Ng; En-Shiun Annie Lee

Tonal Salience in Cognitive Decline: In-Context MCI Detection with Multimodal LLMs
Christopher Song; Abdullah P. Rashed Ahmed

What Moves the Pareto Frontier in Tool-Using Agents? A Compute-Aware Study of ReAct Variants
Rishi N. Simhadri

Filling the Long Tail: Structure-Aware Curriculum-Gap Completion for Medical Education with LLMs
Wenjie Lin

Mechanistic Analysis Of Universality: Numerical Comparison Circuits Across Transformer Architectures
Arya Bhardia; Julian Ramirez; Siddhanta Verma; Karen Mkrtchyan

How Hard is Math? Using Quantitative Metrics to Measure LLM Alignment to Human Intuitions of Difficulty
Micah Helzerman; Steven R Wilson; Cam McLeman

Fine-Grained Semantic Comparison of Legal Documents using LLMs
Elisei Rykov; Nikolay Ivanov; Maria Bandulevich; Kseniia Petrushina; Valentin Malykh; Vasily Konovalov; Alexander Panchenko; Ilseyar Alimova

From Fluent to Useful: Generative AI That Models Purpose, Audience, and Presenter for Scientific Communication
Ishani Mondal

Beyond Discrete Search: Divergent Thinking as Intention Optimization in Latent Space
Mateusz Bystroński; Grzegorz Piotrowski; Tomasz Jan Kajdanowicz

Boosting Self-Consistency with Ranking
Maria Marina; Daniil Moskovskiy; Sergey Pletenev; Mikhail Salnikov; Alexander Panchenko; Viktor Moskvoretskii

LLM-Based Zero-Shot Soft Labeling for Anticipating Disagreement in Negotiation Dialogues
Ken Watanabe; Katsuhide Fujita

Analysis of the Neglect-Zero Effect in Large Language Models
Jin Tanaka; Daiki Matsuoka; Ryoma Kumon; Hitomi Yanaka

Morphology-Aware Multi-Granularity Representation Learning for Agglutinative Languages
Zhonghao Zhang; Na Liu; Jiajia Ma; Nier Wu; Guiping Liu

An Incremental CYK Recognizer for GPU-accelerated General Context-free Prefix Validation
Jiacheng Zhang; Ayesha Khatun; Steven Bethard

Processing Inconsistency Predicts Language Competence: LLM Evaluation Without Answer Labels on Turkic Languages
Ilya Galyukshev; Ilseyar Alimova

TableMBR: Minimum Bayes Risk Table Generation Based on Structural Consistency
Yoshida Daiki; Hiroyuki Deguchi; Yusuke Sakai; Hidetaka Kamigaito; Taro Watanabe

Through the Looking Glass of Multilingual AI: Contrasting Language- and Name Script-Dependent Ethnic Hierarchies in GPT and DeepSeek
Jonathan Sakunkoo; Annabella Sakunkoo

Thesis Proposal: Auditing and Mitigating Demographic Bias in Multi-Stage Retrieval Systems for Criminal Justice Applications
Archan Dutta

Contextual Diversity Measure (CDM) for Controllable Story Generation in Large Language Models
Richard Susilo; Hanna Suominen; Patrik Haslum

Constructing a Japanese Verdict Prediction Dataset for Fact-Checking of LLM-Generated Texts
Miwa Masano; Hirokazu Kiyomaru; Atsushi Keyaki; Kaito Horio; Rei Minamoto; Ribeka Keyaki; Kouta Nakayama; Hideyuki Tachibana; Daisuke Kawahara

Calibration vs Decision Making: Revisiting the Reliability Paradox in Unlearned Language Models
Divyaksh Shukla; Ashutosh Modi

Disentangling Meaning and Language Components in Diverse Multilingual Sentence Embeddings
Kanade Nonomura; Keita Fukushima; Risa Kondo; Tomoyuki Kajiwara

Linguistically-Informed Evaluation of LLMs on Acceptability Judgments in a Forced-Choice Paradigm
Ziyue Liu; Nils Reiter

Representing Lean Proofs as Trajectories in Latent Space
Elisaveta Samoylov; Soroush Vosoughi

Evaluation of Multilingual Ability to Use Spatial Deictic Expressions in Vision-Language Models
Kaito Watanabe; Taisei Yamamoto; Tomoki Doi; Hitomi Yanaka

LLM Parameters for Math Across Languages: Shared or Separate?
Behzad Shomali; Luisa Victor; Tim Selbach; Ali Hamza Bashir; David Berghaus; Joachim Koehler; Mehdi Ali; Markus Frey

Thesis Proposal: Intentional Inference for Insight Generation
Kristýna Onderková

Reference-Free Schema Generation for Literature Review Tables via Multi-Faceted Rewards
Sinjoy Saha; Suman Saha; Mahfuza Farooque; Wenpeng Yin

Factual State Discovery Benchmark: Evaluating Fact Elicitation in Polish Tax Law
Mateusz Bystroński; Kamil Tagowski; Denis Janiak; Julia Farganus; Lukasz Augustyniak; Monika Kajdanowicz; Tomasz Jan Kajdanowicz

Evolutionary Search for Automated Design of Uncertainty Quantification Methods
Mikhail Seleznyov; Daniil Korbut; Viktor Moskvoretskii; Oleg Somov; Alexander Panchenko; Elena Tutubalina

Thesis Proposal: A Normalization-First Framework for Sound, Complete, and Utility-Ready Open Information Extraction
Chandan Prakash; Pavan Kumar Chittimalli; Arnab Bhattacharya

Mind the Gap: Multilingual Divide in LLM Bias Detection and Reasoning
Medha Hira; Prachi Goyal; Raj Maheshwari; Arnav Goel

Multi-Constraint State Tracking with Negation: A Diagnostic Benchmark for LLM World Modeling
Ayan Sar; Pranav Singh Puri; Sumit Aich; Anurag Kaushish; Tanupriya Choudhury; Ajith Abraham

Learning Shortcut Models for Efficient Recursive Reasoning
Shiv Shankar

Convergent Demographic Utility Hierarchies: Geometry of Intersectional Values in LLMs
Pravish Sainath

CRL-Prompt: Contrastive and Reinforcement Learning for Soft Prompt Tuning for Text Classification
Danila Lapokin; Andrey Savchenko

Optimizing Packing and Shuffling Strategies for Enhanced Performance in Generative Language Models
Yanbing Chen; Ruilin Wang; Zihao Yang; Lavender Yao Jiang; Eric Karl Oermann

LLM as a Meta-Judge: Synthetic Data for NLP Evaluation Metric Validation
Lukáš Eigler; Jindřich Libovický; David Hurych

Continuous Context Sampling Allows Extending Diversity Boundaries of Large Language Models
Mateusz Bystroński; Doheon Han; Nitesh V. Chawla; Tomasz Jan Kajdanowicz

One Task Vector is not Enough: A Large-Scale Study for In-Context Learning
Pavel Tikhonov; Ivan Oseledets; Elena Tutubalina

Non-archival

RAQE: Reranker-Aligned Query Expansion via Label-Free Group-Relative Policy Optimization
Gyeonghun Sun; Jeonghwan Choi; Hwanjun Song

Neural KWIC: Inducing Contextualized Word Embeddings from KWIC Concordance Examples
Mao Shimada; Hajime Kiyama; Zhidong Ling; Mamoru Komachi; Toshinobu Ogiso; Hiroya Takamura; Daichi Mochihashi

Thesis Proposal: Establishing Rigorous Evaluation of Sycophancy in Pretrained Language Models
Jan Batzner

Identifying the Convergent Sycophancy Gap in Model Evaluations
Jan Batzner; Volker Stocker; Stefan Schmid; Gjergji Kasneci

Understanding Clinical Cognitive Dialogues Using Large Language Models
Vishalakshi Arumugam; Dan Schumacher; Veronica Rammouz; Enrique Gonzalez; Jeremy J Davis; Anthony Rios

MetaCog-Bench: Quantifying the Metacognition Gap in Edge LLM Tool Calling Under Information Insufficiency
Yu-An Lu; Chun-En Hsiao; Chengwei Chiang; Hong-Han Shuai

EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors
Ryuhei Miyazato; Shunsuke Kitada; Kei Harada

Thesis Proposal: Sensitivity of MT Evaluation Metrics to Semantic Errors: A Case Study on Swedish–Finnish Translation
Nuo Xu

RECON: Benchmarking Agent Memory for Compositional Reasoning over Long Contexts
Mihir Shriniwas Arya

Thesis Proposal: Rethinking Safety Evaluation in Large Language Models
Khaoula Chehbouni

Dissociating Circuit-Level and Distribution-Level Effects of Knowledge Conflicts in LLMs
Pravish Sainath

The signal is coming from inside the noun phrase! Tracking semantic proto-role inferences during sentence processing
Lucas Y. Li; Zander Lynch; Marten Van Schijndel

The Confident Liar: Diagnosing Multi-Agent Debate with Log-Probabilities and LLM-as-Judge
Ali Keramati; Justin Cheok; Jacob Horne; Mark Warschauer

Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)
Guido Ivetta; Pietro Palombini; Sofía Martinelli; Marcos J Gomez; M Emilia Echeveste; Sunipa Dev; Vinodkumar Prabhakaran; Luciana Benotti

Metadata Conditioned Large Language Models for Localization
Anjishnu Mukherjee; Ziwei Zhu; Antonios Anastasopoulos

Think Anywhere in Code Generation
Xue Jiang; Tianyu Zhang; Ge Li; Mengyang Liu; Taozhi Chen; Zhenhua Xu; Wenpin Jiao; Zhi Jin; Yihong Dong