Ted Sandler
Fine-tuning Vision Classifiers On A Budget
Fine-tuning modern computer vision models requires accurately labeled data for which the ground truth may not exist, but a set of multiple labels can be obtained from labelers of variable accuracy. We tie the notion of label quality to con…
Boosted Off-Policy Learning
We propose the first boosting algorithm for off-policy learning from logged bandit feedback. Unlike existing boosting methods for supervised learning, our algorithm directly optimizes an estimate of the policy's expected reward. We analyze…
Bayesian Counterfactual Risk Minimization
We present a Bayesian view of counterfactual risk minimization (CRM) for offline learning from logged bandit feedback. Using PAC-Bayesian analysis, we derive a new generalization bound for the truncated inverse propensity score estimator. …