Eli Chien
Differentially Private Relational Learning with Entity-level Privacy Guarantees
Learning with relational and network-structured data is increasingly vital in sensitive domains where protecting the privacy of individual entities is paramount. Differential Privacy (DP) offers a principled approach for quantifying privac…
Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness
Machine unlearning techniques aim to mitigate unintended memorization in large language models (LLMs). However, existing approaches predominantly focus on the explicit removal of isolated facts, often overlooking latent inferential depende…
Underestimated Privacy Risks for Minority Populations in Large Language Model Unlearning
Large Language Models (LLMs) embed sensitive, human-generated data, prompting the need for unlearning methods. Although certified unlearning offers strong privacy guarantees, its restrictive assumptions make it unsuitable for LLMs, giving …
LayerDAG: A Layerwise Autoregressive Diffusion Model for Directed Acyclic Graph Generation
Directed acyclic graphs (DAGs) serve as crucial data representations in domains such as hardware synthesis and compiler/program optimization for computing systems. DAG generative models facilitate the creation of synthetic DAGs, which can …
Privately Learning from Graphs with Applications in Fine-tuning Large Language Models
Graphs offer unique insights into relationships between entities, complementing data modalities like text and images and enabling AI models to extend their capabilities beyond traditional tasks. However, learning from graphs often involves…
Differentially Private Graph Diffusion with Applications in Personalized PageRanks
Graph diffusion, which iteratively propagates real-valued substances among the graph, is used in numerous graph/network-involved applications. However, releasing diffusion vectors may reveal sensitive linking information in the data such a…
Multifaceted roles of cohesin in regulating transcriptional loops
Cohesin is required for chromatin loop formation. However, its precise role in regulating gene transcription remains largely unknown. We investigated the relationship between cohesin and RNA Polymerase II (RNAPII) using single-molecule map…
Certified Machine Unlearning via Noisy Stochastic Gradient Descent
"The right to be forgotten" ensured by laws for user data privacy becomes increasingly important. Machine unlearning aims to efficiently remove the effect of certain data points on the trained model parameters so that it can be approxima…
Machine Unlearning of Pre-trained Large Language Models
This study investigates the concept of the "right to be forgotten" within the context of large language models (LLMs). We explore machine unlearning as a pivotal solution, with a focus on pre-trained models, a notably under-researched area…
Langevin Unlearning: A New Perspective of Noisy Gradient Descent for Machine Unlearning
Machine unlearning has raised significant interest with the adoption of laws ensuring the "right to be forgotten". Researchers have provided a probabilistic notion of approximate unlearning under a similar definition of Differential Priv…
Machine Unlearning of Pre-trained Large Language Models
The 62nd Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, August 11-16, 2024
Breaking the Trilemma of Privacy, Utility, Efficiency via Controllable Machine Unlearning
Machine Unlearning (MU) algorithms have become increasingly critical due to the imperative adherence to data privacy regulations. The primary objective of MU is to erase the influence of specific data samples on a given model without the n…
On the Inherent Privacy Properties of Discrete Denoising Diffusion Models
Privacy concerns have led to a surge in the creation of synthetic datasets, with diffusion models emerging as a promising avenue. Although prior studies have performed empirical evaluations on these models, there has been a gap in providin…
Federated Classification in Hyperbolic Spaces via Secure Aggregation of Convex Hulls
Hierarchical and tree-like data sets arise in many applications, including language processing, graph data mining, phylogeny and genomics. It is known that tree-like data cannot be embedded into Euclidean spaces of finite dimension with sm…
Differentially Private Decoupled Graph Convolutions for Multigranular Topology Protection
GNNs can inadvertently expose sensitive user information and interactions through their model predictions. To address these privacy concerns, Differential Privacy (DP) protocols are employed to control the trade-off between provable privac…
Representer Point Selection for Explaining Regularized High-dimensional Models
We introduce a novel class of sample-based explanations we term high-dimensional representers, that can be used to explain the predictions of a regularized high-dimensional model in terms of importance weights for each of the training samp…
PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation
The eXtreme Multi-label Classification (XMC) problem seeks to find relevant labels from an exceptionally large label space. Most of the existing XMC learners focus on the extraction of semantic features from input query text. However, conv…
Unlearning Graph Classifiers with Limited Data Resources
As the demand for user privacy grows, controlled data removal (machine unlearning) is becoming an important feature of machine learning models for data-sensitive Web applications such as social networks and recommender systems. Nevertheles…
Certified Graph Unlearning
Graph-structured data is ubiquitous in practice and often processed using graph neural networks (GNNs). With the adoption of recent laws ensuring the "right to be forgotten", the problem of graph data removal has become of significant im…
HyperAid: Denoising in hyperbolic spaces for tree-fitting and hierarchical clustering
The problem of fitting distances by tree-metrics has received significant attention in the theoretical computer science and machine learning communities alike, due to many applications in natural language processing, phylogeny, cancer geno…
Small-Sample Estimation of the Mutational Support and Distribution of SARS-CoV-2
We consider the problem of determining the mutational support and distribution of the SARS-CoV-2 viral genome in the small-sample regime. The mutational support refers to the unknown number of sites that may eventually mutate in the SARS-C…
Provably Accurate and Scalable Linear Classifiers in Hyperbolic Spaces
Many high-dimensional practical data sets have hierarchical structures induced by graphs or time series. Such data sets are hard to process in Euclidean spaces and one often seeks low-dimensional embeddings in other space forms to perform …
Node Feature Extraction by Self-Supervised Multi-scale Neighborhood Prediction
Learning on graphs has attracted significant attention in the learning community due to numerous real-world applications. In particular, graph neural networks (GNNs), which take numerical node features and graph structure as inputs, have b…
Landing Probabilities of Random Walks for Seed-Set Expansion in Hypergraphs
We describe the first known mean-field study of landing probabilities for random walks on hypergraphs. In particular, we examine clique-expansion and tensor methods and evaluate their mean-field characteristics over a class of random hyper…
Highly Scalable and Provably Accurate Classification in Poincare Balls
Many high-dimensional and large-volume data sets of practical relevance have hierarchical structures induced by trees, graphs or time series. Such data sets are hard to process in Euclidean spaces and one often seeks low-dimensional embedd…
Support Estimation with Sampling Artifacts and Errors
The problem of estimating the support of a distribution is of great importance in many areas of machine learning, computer science, physics and biology. Most of the existing work in this domain has focused on settings that assume perfectly…
You are AllSet: A Multiset Function Framework for Hypergraph Neural Networks
Hypergraphs are used to model higher-order interactions amongst agents and there exist many practically relevant instances of hypergraph datasets. To enable efficient processing of hypergraph-structured data, several hypergraph neural netw…
Linear Classifiers in Mixed Constant Curvature Spaces.
Embedding methods for mixed-curvature spaces are powerful techniques for low-distortion and low-dimensional representation of complex data structures. Nevertheless, little is known regarding downstream learning and optimization in the embe…