arXiv (Cornell University)
Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification
August 2020 • Wenjun Qiu, David Lie
Privacy policies are statements that notify users of the services' data practices. However, few users are willing to read through policy texts because of their length and complexity. While automated tools based on machine learning exist for privacy policy analysis, classifiers need to be trained on a large labeled dataset to achieve high classification accuracy. Most existing policy corpora are labeled by skilled human annotators, requiring a significant amount of labor hours and effort. In this paper, we leverage active…