Shahar Levy
ReliableEval: A Recipe for Stochastic LLM Evaluation via Method of Moments
LLMs are highly sensitive to prompt phrasing, yet standard benchmarks typically report performance using a single prompt, raising concerns about the reliability of such evaluations. In this work, we argue for a stochastic method of moments…
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG
Retrieval-Augmented Generation (RAG) enhances the accuracy of Large Language Model (LLM) responses by leveraging relevant external documents during generation. Although previous studies noted that retrieving many documents can degrade perf…
SEAM: A Stochastic Benchmark for Multi-Document Tasks
Various tasks, such as summarization, multi-hop question answering, or coreference resolution, are naturally phrased over collections of real-world documents. Such tasks present a unique set of challenges, revolving around the lack of cohe…
Collecting a Large-Scale Gender Bias Dataset for Coreference Resolution and Machine Translation
Recent works have found evidence of gender bias in models of machine translation and coreference resolution using mostly synthetic diagnostic datasets. While these quantify bias in a controlled experiment, they often do so on a small scale…
Cell-type specific outcome representation in primary motor cortex
Adaptive movements are critical to animal survival. To guide future actions, the brain monitors different outcomes, including achievement of movement and appetitive goals. The nature of outcome signals and their neuronal and network realiz…
Rigorous Analytical Model for Metasurface Microscopic Design with Interlayer Coupling
We present a semianalytical method for designing meta-atoms in multilayered metasurfaces (MSs), relying on a rigorous model developed for multielement metagratings. Notably, this model properly accounts for near-field coupling effects, all…