Synthetic Diversity in Academic Paper Reviews with LLMs

Ruoxi Shang, David McDonald, Chirag Shah, Gary Hsieh

In Revision for CHI 2025

Abstract

Obtaining diverse expert feedback on academic research is valuable yet challenging. Large Language Models (LLMs) show promise in simulating varied perspectives and generating paper reviews, but perceptions of synthetic diverse research feedback remain underexplored. This study investigates how researchers perceive LLM-generated reviews compared to human reviews. We generated synthetic diverse reviews for participants’ papers and conducted a mixed-methods study with 18 experienced researchers. Participants recognized synthetic diversity in the reviews’ expertise and attitudinal stances, along with benefits in uncovering blind spots, identifying critical issues, and enhancing their willingness to improve their work. We also found that LLM-generated and human reviews were perceived differently in degree of diversity, homogeneity, authenticity of expertise, and divergence of opinions. Our findings offer insights into LLMs’ role in academic discourse and inform guidelines for generating meaningful LLM-augmented diverse feedback. We also contribute a dataset of over 800 sentence-level annotations from 54 synthetic and 62 human reviews to facilitate future research.