Samuel D. Marks
Heterogenous Dynamics in a Polymer Solution Revealed through Measurement of Ultraslow Convection
Understanding solution-phase aggregation and dynamics in complex fluids is critical for material processing, yet widely used dynamic light scattering (DLS) fails for strongly attenuating systems such as conjugated polymers. We use X-ray ph…
Eliciting Secret Knowledge from Language Models
We study secret elicitation: discovering knowledge that an AI possesses but does not explicitly verbalize. As a testbed, we train three families of large language models (LLMs) to possess specific knowledge that they apply downstream but d…
The Quest for the Right Mediator: Surveying Mechanistic Interpretability for NLP Through the Lens of Causal Mediation Analysis
Interpretability provides a toolset for understanding how and why language models behave in certain ways. However, there is little unity in the field: most studies employ ad-hoc evaluations and do not share theoretical foundations, making …
Steering Out-of-Distribution Generalization with Concept Ablation Fine-Tuning
Fine-tuning large language models (LLMs) can lead to unintended out-of-distribution generalization. Standard approaches to this problem rely on modifying training data, for example by adding data that better specify the intended generaliza…
Auditing language models for hidden objectives
We study the feasibility of conducting alignment audits: investigations into whether models have undesired objectives. As a testbed, we train a language model with a hidden objective. Our training pipeline first teaches the model about exp…
SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability
Sparse autoencoders (SAEs) are a popular technique for interpreting language model activations, and there is extensive recent work on improving SAE effectiveness. However, most prior work evaluates progress using unsupervised proxy metrics…
Unveiling the Mechanism of Mn Dissolution Through a Dynamic Cathode‐Electrolyte Interphase on LiMn2O4
Understanding the formation and evolution of the cathode‐electrolyte interphase (CEI), which forms at the interface between the cathode and electrolyte, is crucial for revealing degradation mechanisms in cathode materials, especially for d…
Microstructure-Dependent Sodium Storage Mechanisms in Hard Carbon Anodes
Hard carbon (HC) is a leading anode material for sodium-ion batteries, but its complex microstructure complicates understanding of sodium storage mechanisms. Using X-ray total scattering and density functional theory calculations, this stu…
In Situ Characterization of the Oxidation Behavior of Carbonate-Based Electrolytes for Lithium-Ion Batteries by Scanning Electrochemical Microscopy
Lithium-ion batteries (LIBs) have been widely employed as energy storage devices in portable electronics and electric vehicles. Many processes occurring at the electrode/electrolyte interphases lead to performance degradation over time and…
Evaluating Sparse Autoencoders on Targeted Concept Erasure Tasks
Sparse Autoencoders (SAEs) are an interpretability technique aimed at decomposing neural network activations into interpretable units. However, a major bottleneck for SAE development has been the lack of high-quality performance metrics, w…
Erasing Conceptual Knowledge from Language Models
In this work, we introduce Erasure of Language Memory (ELM), a principled approach to concept-level unlearning that operates by matching distributions defined by the model's own introspective classification capabilities. Our key insight is…
The Quest for the Right Mediator: Surveying Mechanistic Interpretability Through the Lens of Causal Mediation Analysis
Interpretability provides a toolset for understanding how and why neural networks behave in certain ways. However, there is little unity in the field: most studies employ ad-hoc evaluations and do not share theoretical foundations, making …
Measuring Progress in Dictionary Learning for Language Model Interpretability with Board Game Models
What latent features are encoded in language model (LM) representations? Recent work on training sparse autoencoders (SAEs) to disentangle interpretable features in LM representations has shown significant promise. However, evaluating the …
Optical and electronic functionality arising from controlled defect formation in nanoscale complex oxide lateral epitaxy
Epitaxial crystallization of complex oxides provides the means to create materials with precisely selected composition, strain, and orientation, thereby controlling their functionalities. Extending this control to nanoscale three-dimension…
NNsight and NDIF: Democratizing Access to Open-Weight Foundation Model Internals
We introduce NNsight and NDIF, technologies that work in tandem to enable scientific study of the representations and computations learned by very large neural networks. NNsight is an open-source system that extends PyTorch to introduce de…
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
We introduce methods for discovering and applying sparse feature circuits. These are causally implicated subnetworks of human-interpretable features for explaining language model behaviors. Circuits identified in prior work consist of poly…
Resolving Length Scale Dependent Transient Disorder Through an Ultrafast Phase Transition
Material functionality can be strongly determined by structure extending only over nanoscale distances. The pair distribution function presents an opportunity to shift structural studies beyond idealized crystal models and investigate stru…
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popular…
MOCVD of InGaN on ScAlMgO4 on Al2O3 Substrates with Improved Surface Morphology and Crystallinity
ScAlMgO4 (SAM) is a promising substrate material for group III-nitride semiconductors. SAM has a lower lattice mismatch with III-nitride materials compared to conventionally used sapphire (Al2O3) and silicon substrates. Bulk SAM substrate …
Ultrafast Bragg Coherent Diffraction Imaging of Epitaxial Thin Films using Deep Complex-valued Neural Networks
Domain wall structures form spontaneously due to epitaxial misfit during thin film growth. Imaging the dynamics of domains and domain walls at ultrafast timescales can provide fundamental clues to features that impact electrical transport …
Subpicosecond Optical Stress Generation in Multiferroic BiFeO3
Optical excitation leads to ultrafast stress generation in the prototypical multiferroic BiFeO3. The time scales of stress generation are set by the dynamics of the population of excited electronic states and the coupling of the electronic…
Structural Evidence for Ultrafast Polarization Rotation in Ferroelectric/Dielectric Superlattice Nanodomains
Weakly coupled ferroelectric/dielectric superlattice thin film heterostructures exhibit complex nanoscale polarization configurations that arise from a balance of competing electrostatic, elastic, and domain-wall contributions to the free …
Instrument for in situ hard x-ray nanobeam characterization during epitaxial crystallization and materials transformations
Solid-phase epitaxy (SPE) and other three-dimensional epitaxial crystallization processes pose challenging structural and chemical characterization problems. The concentration of defects, the spatial distribution of elastic strain, and the…
Resonant nanodiffraction x-ray imaging reveals role of magnetic domains in complex oxide spin caloritronics
X-ray nanodiffraction reveals the magnetism of spincaloritronic oxides and how it can be controlled using lattice distortion.