Seungwon Lim

Hi, I'm Seungwon Lim. I'm a researcher in LangAGI (Language & AGI Lab) at Yonsei University, advised by Jinyoung Yeo. I received my bachelor's degree in Computer Science, and I am currently pursuing an integrated MS/PhD program in Computer Science.

Currently, I'm working as a research scientist intern at Exaone Lab, where I conduct research on building advanced foundation large language models.

My research centers on developing reliable agent systems. To this end, I am currently focusing on agents' reasoning, action decision-making, and human-centric AI.

Publications

VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms

Seungwon Lim, Sungwoong Kim, Jihwan Yu, Sungjae Lee, Jiwan Chung, Youngjae Yu

EMNLP2025 Main

TLDR; We introduce VisEscape, a benchmark inspired by Escape Room games, and evaluate the Reasoning and Decision-making of diverse MLLMs in exploration-driven, dynamic environments.

Verifying the Verifiers: Unveiling Pitfalls and Potentials in Fact Verifiers

Wooseok Seo, Seungju Han, Jaehun Jung, Benjamin Newman, Seungwon Lim, Seungbeen Lee, Ximing Lu, Yejin Choi, Youngjae Yu

COLM2025

When AI co-scientists fail: SPOT-a benchmark for automated verification of scientific research

Guijin Son, Jiwoo Hong, Honglu Fan, Heejeong Nam, Hyunwoo Ko, Seungwon Lim, Jinyeop Song, Jinha Choi, Gonçalo Paulo, Youngjae Yu, Stella Biderman

Under Review

TLDR; We introduce SPOT, a benchmark for automated verification of scientific research, and show that substantial room for improvement remains in AI-assisted academic verification.

Persona Dynamics: Unveiling the Impact of Persona Traits on Agents in Text-Based Games

Seungwon Lim, Seungbeen Lee, Dongjun Min, Youngjae Yu

ACL2025 Main (Oral)

TLDR; We introduce PANDA, which incorporates Human Personality Traits into AI agents for Text-based Games and examines how these traits impact their behavior and performance.

Do LLMs Have Distinct and Consistent Personality? TRAIT: Personality Testset designed for LLMs with Psychometrics

Seungwon Lim*, Seungbeen Lee*, Seungju Han, Giyeong Oh, Hyungjoo Chae, Jiwan Chung, Minju Kim, Beong-woo Kwak, Yeonsoo Lee, Dongha Lee, Jinyoung Yeo, Youngjae Yu

NAACL2025 Findings

TLDR; We introduce TRAIT, a psychometrics-based benchmark that measures the personality revealed in the Behavior Patterns of LLMs, and we verify its Reliability and Validity.

MASS: Overcoming Language Bias in Image-Text Matching

Jiwan Chung, Seungwon Lim, Sangkyu Lee and Youngjae Yu

AAAI2025 Main

TLDR; We introduce MASS, a Training-free framework that improves Visual Accuracy and reduces Bias in Image-Text Matching for pretrained visual-language models.

Can visual language models resolve textual ambiguity with visual cues? Let visual puns tell you!

Jiwan Chung, Seungwon Lim, Jaehyun Jeon, Seungbeen Lee and Youngjae Yu

EMNLP2024 Main

TLDR; We introduce UNPIE, a new benchmark crafted to evaluate how multimodal inputs influence the Resolution of Lexical Ambiguities.

CLARA: Classifying and Disambiguating User Commands for Reliable Interactive Robotic Agents

Jeongeun Park, Seungwon Lim, Joonhyung Lee, Sangbeom Park, Minsuk Chang, Youngjae Yu and Sungjoon Choi

ICRA2024

TLDR; We introduce CLARA, an LLM-empowered method for robots to estimate the Uncertainty of user commands and to Disambiguate them via question generation for clarification.