Arjun R. Loomba
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Most of the existing Large Language Model (LLM) benchmarks on scientific problem reasoning focus on problems grounded in high-school subjects and are confined to elementary algebraic operations. To systematically examine the reasoning capa…