Steffi Chern
👋 Hi everyone! I’m a first-year Computer Science Ph.D. student at the University of Pennsylvania, advised by Eric Wong. I was fortunate to receive the NSF Graduate Research Fellowship (GRFP) in 2025. Before that, I graduated with a B.S. in Statistics and Machine Learning from Carnegie Mellon University (CMU), where I was advised by Prof. Graham Neubig and Prof. Pengfei Liu.
🧠 My research interests lie at the intersection of natural language processing (NLP) and machine learning (ML). Specifically, I’m interested in the following topics:
- Superalignment: Ensuring that superintelligent AI systems autonomously adhere to human goals, values, and safety standards, especially as they surpass human cognitive understanding and operate in scenarios that are hard for humans to control. Related work: ScaleEval, OlympicArena, BeHonest.
- Trustworthy and Robust AI: Ensuring AI systems are reliable, honest, and safe by developing tools to detect factual inaccuracies, prevent hallucinations, adapt AI behavior to dynamic human norms, and strengthen robustness against adversarial attacks. Related work: BeHonest, FacTool, Align on the Fly, Halu-J, Combating Adversarial Attacks.
⛳ I was also a student-athlete, playing NCAA Women’s Golf at CMU.
📩 Feel free to contact me about any research/job opportunities or questions you have!
Academic Service
🔎 Reviewer: NeurIPS (2024, 2025), ICLR (2025), AISTATS (2025), COLM (2025)
Teaching
📚 Teaching Assistant for Probability Theory, Fall 2023
📚 Teaching Assistant for Advanced Methods for Data Analysis, Spring 2024
Publications
⬇️ Below are some of my recent publications:
Thinking with Generated Images
Ethan Chern, Zhulin Hu, Steffi Chern, Siqi Kou, Jiadi Su, Yan Ma, Zhijie Deng, Pengfei Liu
Preprint. [paper] [github]
BeHonest: Benchmarking Honesty in Large Language Models
Steffi Chern, Zhulin Hu, Yuqing Yang, Ethan Chern, Yuan Guo, Jiahe Jin, Binjie Wang, Pengfei Liu
Preprint. [paper] [github] [website]
OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI
Zhen Huang, Zengzhi Wang, Shijie Xia, Xuefeng Li, Haoyang Zou, Ruijie Xu, Run-Ze Fan, Lyumanshan Ye, Ethan Chern, Yixin Ye, Yikai Zhang, Yuqing Yang, Ting Wu, Binjie Wang, Shichao Sun, Yang Xiao, Yiyuan Li, Fan Zhou, Steffi Chern, Yiwei Qin, Yan Ma, Jiadi Su, Yixiu Liu, Yuxiang Zheng, Shaoting Zhang, Dahua Lin, Yu Qiao, Pengfei Liu
Accepted to the NeurIPS 2024 D&B Track. [paper] [github] [website]
Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
Steffi Chern, Ethan Chern, Graham Neubig, Pengfei Liu
Accepted to AAAI 2025 AI4Research (Oral). [paper] [github]
Halu-J: Critique-Based Hallucination Judge
Binjie Wang, Steffi Chern, Ethan Chern, Pengfei Liu
Accepted to the AAAI 2025 Workshop on Preventing and Detecting LLM Misinformation (Oral). [paper]
FacTool: Factuality Detection in Generative AI - A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
I-Chun Chern, Steffi Chern, Shiqi Chen, Weizhe Yuan, Kehua Feng, Chunting Zhou, Junxian He, Graham Neubig, Pengfei Liu
Accepted to COLM 2025. [paper] [github] [website]
Combating Adversarial Attacks with Multi-Agent Debate
Steffi Chern*, Zhen Fan*, Andy Liu*
Preprint. [paper]
Align on the Fly: Adapting Chatbot Behavior to Established Norms
Chunpu Xu, Steffi Chern, Ethan Chern, Ge Zhang, Zekun Wang, Ruibo Liu, Jing Li, Jie Fu, Pengfei Liu
Preprint. [paper] [github]