publications | Eric Bigelow

2025

Forking paths in neural text generation

Eric Bigelow, Ari Holtzman, Hidenori Tanaka, and Tomer Ullman

International Conference on Learning Representations (ICLR), 2025

Belief Dynamics Reveal the Dual Nature of In-Context Learning and Activation Steering

Eric Bigelow*, Daniel Wurgaft*, YingQiao Wang, Noah Goodman, Tomer Ullman, Hidenori Tanaka, and Ekdeep Singh Lubana

arXiv preprint arXiv:2511.00617, 2025

People evaluate the algorithms that drive agents’ behavior

Eric Bigelow and Tomer Ullman

Open Mind, 2025

Chain of Time: In-Context Physical Simulation with Image Generation Models

YingQiao Wang*, Eric Bigelow*, Boyi Li, and Tomer Ullman

arXiv preprint arXiv:2511.00110, 2025

Evaluating Self-Orienting in Language and Reasoning Models

Eric Bigelow*, Zergham Ahmed*, and Tomer Ullman

ICML Workshop on Assessing World Models: Methods and Metrics for Evaluating Understanding, 2025

Let’s Simulate Frame-by-Frame: In-Context Physical Simulations with Vision-Language Models

YingQiao Wang*, Eric Bigelow*, and Tomer Ullman

ICML Workshop on Assessing World Models: Methods and Metrics for Evaluating Understanding, 2025

Are language models aware of the road not taken? Token-level uncertainty and hidden state dynamics

Amir Zur, Atticus Geiger, Ekdeep Singh Lubana, and Eric Bigelow

ICML Workshop on Actionable Interpretability, 2025

Priors in Time: Missing Inductive Biases for Language Model Interpretability

Ekdeep Singh Lubana, Can Rager, Sai Sumedh R. Hindupur, Valerie Costa, Greta Tuckute, Oam Patel, Sonia Krishna Murthy, Thomas Fel, Daniel Wurgaft, Eric Bigelow, Johnny Lin, Demba Ba, Martin Wattenberg, Fernanda Viegas, Melanie Weber, and Aaron Mueller

arXiv preprint arXiv:2511.01836, 2025

Actual or counterfactual? Asymmetric responsibility attributions in language models

Eric Bigelow*, Yang Xiang*, Tobias Gerstenberg, Tomer Ullman, and Samuel J Gershman

In NeurIPS Workshop on Interpreting Cognition in Deep Learning Models (CogInterp), 2025

Language models assign responsibility based on actual rather than counterfactual contributions

Yang Xiang*, Eric Bigelow*, Tobias Gerstenberg, Tomer Ullman, and Samuel J Gershman

In Proceedings of the Annual Meeting of the Cognitive Science Society, 2025

On (Not) Seeing Your Self: Some People May Lack Third-Person Imagery

Eric Bigelow and Tomer Ullman

In Society for Philosophy and Psychology, 2025

Emergence of Hierarchical Emotion Organization in Large Language Models

Bo Zhao, Maya Okawa, Eric Bigelow, Rose Yu, Tomer Ullman, Ekdeep Singh Lubana, and Hidenori Tanaka

arXiv preprint arXiv:2507.10599, 2025

2024

In-Context Learning Dynamics with Random Binary Sequences

Eric Bigelow, Ekdeep Singh Lubana, Robert P Dick, Hidenori Tanaka, and Tomer Ullman

International Conference on Learning Representations (ICLR), 2024

Foundational challenges in assuring alignment and safety of large language models

Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, ..., Eric Bigelow, ..., and David Krueger

Transactions on Machine Learning Research (TMLR), 2024

2023

Mechanistic mode connectivity

Ekdeep Singh Lubana, Eric Bigelow, Robert P Dick, David Krueger, and Hidenori Tanaka

International Conference on Machine Learning (ICML), 2023

Subjective Randomness and In-Context Learning

Eric Bigelow, Ekdeep Singh Lubana, Robert P Dick, Hidenori Tanaka, and Tomer Ullman

In NeurIPS Workshop on UniReps: Unifying Representations in Neural Models, 2023

Non-commitment in mental imagery

Eric Bigelow, John P McCoy, and Tomer Ullman

Cognition, 2023

2022

People’s evaluation of programs that drive agents’ behavior

Eric Bigelow and Tomer Ullman

In Proceedings of the Annual Meeting of the Cognitive Science Society, 2022

Mechanistic Lens on Mode Connectivity

Ekdeep Singh Lubana, Eric Bigelow, Robert Dick, David Krueger, and Hidenori Tanaka

NeurIPS Workshop on Distribution Shifts: Connecting Methods and Applications, 2022

Opening the black box: People evaluate agents based on the algorithms that drive their behavior

Eric Bigelow and Tomer Ullman

In Society for Philosophy and Psychology, 2022

2016

Inferring priors in compositional cognitive models

Eric Bigelow and Steven T Piantadosi

Proceedings of the Annual Meeting of the Cognitive Science Society, 2016

A large dataset of generalization patterns in the number game

Eric Bigelow and Steven Piantadosi

Journal of Open Psychology Data, 2016

Tales of two cities: Using social media to understand idiosyncratic lifestyles in distinctive metropolitan areas

Tianran Hu, Eric Bigelow, Jiebo Luo, and Henry Kautz

IEEE Transactions on Big Data, 2016

2015

On the need for imagistic modeling in story understanding

Eric Bigelow, Daniel Scarafoni, Lenhart Schubert, and Alex Wilson

Biologically Inspired Cognitive Architectures, 2015
