When AI Says One Thing but Does Another

Professor Shen joined NYU Shanghai in fall 2025 as an assistant professor of computer science

AI tools are quickly becoming part of everyday life, from chatbots to job applicant screening systems or ones that support doctors to make medical decisions. When prompted, many of these systems can articulate values, for example claiming to oppose bias, avoid dominating users, or prioritize honesty. But when put to use in real-world situations, do these systems behave according to the values they claim to uphold?

A new study led by Assistant Professor of Computer Science Shen Hua tackles that question head-on. In “Mind the Value-Action Gap: Do LLMs Act in Alignment with Their Values?”, Shen and co-authors Nicholas Clark and Tanu Mitra from University of Washington introduce a framework for testing whether large language models (LLMs) behave consistently with the values they claim to hold. The paper received an Outstanding Paper Award at the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP 2025), a prestigious honor recognizing the top 0.5% of accepted papers.

Professor Shen receiving the award at EMNLP 2025
Professor Shen receiving the award at EMNLP 2025 

Shen traces the project back to a basic question she encountered while working on human-AI alignment: “Alignment is a two-way thing—humans and AI aligning with each other—and that leads to a fundamental question,” she asks, “ align for what?” Through extensive literature review, she said, her team found a clear answer: a key goal of human-AI alignment is “value alignment.” They also found a gap: while many studies ask models to state their values, there was little systematic work examining whether models act in accordance with those values across diverse real-world contexts.

To fill that gap, the team built ValueActionLens, a context-aware framework that compares what a model claims about a value with how it acts in a value-relevant situation. The framework is supported by a dataset of 14,784 value-informed actions spanning 12 countries, 11 social topics, and 56 human values drawn from Schwartz’s Theory of Basic Values, a widely used psychology framework for describing universal human values.    

The evaluation is designed to directly compare “what an LLM says” with “what it does” First, a model is asked how strongly it agrees or disagrees with a value (such as “social power”) in a specific context. Then it is asked to choose between two plausible actions, one consistent with agreeing with that value and the other consistent with disagreeing. Because no existing resource offered large-scale, context-matched action pairs, the team created the dataset through a human-in-the-loop process, including expert review and cross-cultural validation.

When the researchers tested multiple widely used LLMs, they found that the “value-action gap” is real: models do not always “do” what they “say” they believe. “LLMs’ value and action gap is large in many cases,” Shen said. “They don’t always ‘say’ what they ‘do’, or ‘do’ what they ‘say.’” She added that the gap is often highly context-dependent—models may appear aligned in one setting but drift in another.

One case, in which an AI home assistant bot is asked to help a Nigerian family make healthcare decisions illustrates that value gap. While the AI assistant claims to disagree with dominance, in actuality it chooses to make decisions for the family and ensure everyone follows its recommendation, revealing a clear mismatch between stated values and real choices.

Why does this matter? Shen emphasized that the risks are not limited to the “usual suspects” like fairness checklists. Misalignment around values such as social power or loyalty can shape how AI behaves in subtle but consequential ways, especially in high-stakes settings where people may rely on AI without paying close attention to how those suggestions steer decisions. 

Professor Shen joined NYU Shanghai in fall 2025 as an assistant professor of computer science
Professor Shen joined NYU Shanghai in fall 2025 as an assistant professor of computer science

“This study is among the first to systematically evaluate whether models follow through on their stated values, and it draws attention to overlooked but important risk areas, such as social power and loyalty,” said NYU Shanghai’s Interim Dean of Computer Science, Data Science, and Engineering Nasir Memon. “That’s one reason this paper stood out at EMNLP.” He noted that  Shen’s research strengthens the university’s efforts to advance responsible, human-centered AI.

Looking ahead, Shen sees two major research directions. First, she wants to study how models handle trade-offs among values, which can shift along a spectrum of priorities rather than fitting into simple binaries. Second, she aims to capture how different people and communities hold different values, and how AI systems could be tailored accordingly to each user group.  

As she explained, the “right” value set may differ depending on the situation, culture, or community. For example, healthcare scenarios may demand safer and more conservative behavior, while creative work demands exploration and novelty.  Ultimately, she argues, researchers must learn how to identify the values a specific situation needs, and then find ways to incorporate those values into AI systems.