Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification
2020 · Open Access · DOI: https://doi.org/10.48550/arxiv.2008.02954
Privacy policies are statements that notify users of the services' data practices. However, few users are willing to read through policy texts because of their length and complexity. Automated tools based on machine learning exist for privacy policy analysis, but to achieve high classification accuracy, their classifiers must be trained on a large labeled dataset. Most existing policy corpora are labeled by skilled human annotators, which requires a significant amount of labor and effort. In this paper, we leverage active learning and crowdsourcing techniques to develop an automated classification tool named Calpric (Crowdsourcing Active Learning PRIvacy Policy Classifier), which can produce annotations equivalent to those of skilled human annotators with high accuracy while minimizing the labeling cost. Specifically, active learning allows classifiers to proactively select the most informative segments to be labeled. On average, our model achieves the same F1 score using only 62% of the original labeling effort. Calpric's use of active learning also addresses the naturally occurring class imbalance in unlabeled privacy policy datasets, where many more statements describe the collection of private information than state the absence of collection. By selecting samples from the minority class for labeling, Calpric automatically creates a more balanced training set.
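
The abstract's idea of letting a classifier pick the most informative segments can be illustrated with entropy-based uncertainty sampling, one common active learning query strategy. This is only a minimal sketch: the abstract does not say which query strategy Calpric uses, and the function names and probability values below are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_most_uncertain(pool_probs, k):
    """Return the indices of the k pool items with the highest predictive entropy."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical model outputs [P(collection), P(no collection)]
# for four unlabeled policy segments (illustrative values only).
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.80, 0.20],
    [0.50, 0.50],
]

# Segments the model is least sure about are sent to annotators first.
to_label = select_most_uncertain(pool_probs, k=2)
print(to_label)
```

Here the two segments with near-even probabilities are chosen, since a label for them is expected to change the model the most, while the confidently classified segment is left unlabeled.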
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2008.02954, https://arxiv.org/pdf/2008.02954
- OA Status: green
- Cited By: 1
- References: 46
- Related Works: 10
- OpenAlex ID: https://openalex.org/W3048081621
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W3048081621 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2008.02954 (Digital Object Identifier)
- Title: Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2020 (year of publication)
- Publication date: 2020-08-07 (full publication date if available)
- Authors: Wenjun Qiu, David Lie (list of authors in order)
- Landing page: https://arxiv.org/abs/2008.02954 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2008.02954 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2008.02954 (direct OA link when available)
- Concepts: Crowdsourcing, Computer science, Leverage (statistics), Classifier (UML), Machine learning, Artificial intelligence, Privacy policy, Annotation, Labeled data, Class (philosophy), Active learning (machine learning), Information privacy, World Wide Web, Internet privacy (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 1 (total citation count in OpenAlex)
- Citations by year (recent): 2024: 1 (per-year citation counts, last 5 years)
- References (count): 46 (number of works referenced by this work)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W3048081621 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2008.02954 |
| ids.doi | https://doi.org/10.48550/arxiv.2008.02954 |
| ids.mag | 3048081621 |
| ids.openalex | https://openalex.org/W3048081621 |
| fwci | |
| type | preprint |
| title | Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11704 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9984999895095825 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1706 |
| topics[0].subfield.display_name | Computer Science Applications |
| topics[0].display_name | Mobile Crowdsensing and Crowdsourcing |
| topics[1].id | https://openalex.org/T10764 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9983000159263611 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Privacy-Preserving Technologies in Data |
| topics[2].id | https://openalex.org/T12072 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9979000091552734 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1702 |
| topics[2].subfield.display_name | Artificial Intelligence |
| topics[2].display_name | Machine Learning and Algorithms |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C62230096 |
| concepts[0].level | 2 |
| concepts[0].score | 0.8950792551040649 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q275969 |
| concepts[0].display_name | Crowdsourcing |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.7917567491531372 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C153083717 |
| concepts[2].level | 2 |
| concepts[2].score | 0.7523858547210693 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q6535263 |
| concepts[2].display_name | Leverage (statistics) |
| concepts[3].id | https://openalex.org/C95623464 |
| concepts[3].level | 2 |
| concepts[3].score | 0.7110389471054077 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q1096149 |
| concepts[3].display_name | Classifier (UML) |
| concepts[4].id | https://openalex.org/C119857082 |
| concepts[4].level | 1 |
| concepts[4].score | 0.6846998333930969 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q2539 |
| concepts[4].display_name | Machine learning |
| concepts[5].id | https://openalex.org/C154945302 |
| concepts[5].level | 1 |
| concepts[5].score | 0.6840596199035645 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[5].display_name | Artificial intelligence |
| concepts[6].id | https://openalex.org/C102938260 |
| concepts[6].level | 3 |
| concepts[6].score | 0.5942631959915161 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q1999831 |
| concepts[6].display_name | Privacy policy |
| concepts[7].id | https://openalex.org/C2776321320 |
| concepts[7].level | 2 |
| concepts[7].score | 0.5351335406303406 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q857525 |
| concepts[7].display_name | Annotation |
| concepts[8].id | https://openalex.org/C2776145971 |
| concepts[8].level | 2 |
| concepts[8].score | 0.47873565554618835 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q30673951 |
| concepts[8].display_name | Labeled data |
| concepts[9].id | https://openalex.org/C2777212361 |
| concepts[9].level | 2 |
| concepts[9].score | 0.45977815985679626 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q5127848 |
| concepts[9].display_name | Class (philosophy) |
| concepts[10].id | https://openalex.org/C77967617 |
| concepts[10].level | 2 |
| concepts[10].score | 0.4585113525390625 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q4677561 |
| concepts[10].display_name | Active learning (machine learning) |
| concepts[11].id | https://openalex.org/C123201435 |
| concepts[11].level | 2 |
| concepts[11].score | 0.32968565821647644 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q456632 |
| concepts[11].display_name | Information privacy |
| concepts[12].id | https://openalex.org/C136764020 |
| concepts[12].level | 1 |
| concepts[12].score | 0.2118965983390808 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q466 |
| concepts[12].display_name | World Wide Web |
| concepts[13].id | https://openalex.org/C108827166 |
| concepts[13].level | 1 |
| concepts[13].score | 0.1455395221710205 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q175975 |
| concepts[13].display_name | Internet privacy |
| keywords[0].id | https://openalex.org/keywords/crowdsourcing |
| keywords[0].score | 0.8950792551040649 |
| keywords[0].display_name | Crowdsourcing |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.7917567491531372 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/leverage |
| keywords[2].score | 0.7523858547210693 |
| keywords[2].display_name | Leverage (statistics) |
| keywords[3].id | https://openalex.org/keywords/classifier |
| keywords[3].score | 0.7110389471054077 |
| keywords[3].display_name | Classifier (UML) |
| keywords[4].id | https://openalex.org/keywords/machine-learning |
| keywords[4].score | 0.6846998333930969 |
| keywords[4].display_name | Machine learning |
| keywords[5].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[5].score | 0.6840596199035645 |
| keywords[5].display_name | Artificial intelligence |
| keywords[6].id | https://openalex.org/keywords/privacy-policy |
| keywords[6].score | 0.5942631959915161 |
| keywords[6].display_name | Privacy policy |
| keywords[7].id | https://openalex.org/keywords/annotation |
| keywords[7].score | 0.5351335406303406 |
| keywords[7].display_name | Annotation |
| keywords[8].id | https://openalex.org/keywords/labeled-data |
| keywords[8].score | 0.47873565554618835 |
| keywords[8].display_name | Labeled data |
| keywords[9].id | https://openalex.org/keywords/class |
| keywords[9].score | 0.45977815985679626 |
| keywords[9].display_name | Class (philosophy) |
| keywords[10].id | https://openalex.org/keywords/active-learning |
| keywords[10].score | 0.4585113525390625 |
| keywords[10].display_name | Active learning (machine learning) |
| keywords[11].id | https://openalex.org/keywords/information-privacy |
| keywords[11].score | 0.32968565821647644 |
| keywords[11].display_name | Information privacy |
| keywords[12].id | https://openalex.org/keywords/world-wide-web |
| keywords[12].score | 0.2118965983390808 |
| keywords[12].display_name | World Wide Web |
| keywords[13].id | https://openalex.org/keywords/internet-privacy |
| keywords[13].score | 0.1455395221710205 |
| keywords[13].display_name | Internet privacy |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2008.02954 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2008.02954 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2008.02954 |
| locations[1].id | doi:10.48550/arxiv.2008.02954 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2008.02954 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5065411994 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Wenjun Qiu |
| authorships[0].countries | CA |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I185261750 |
| authorships[0].affiliations[0].raw_affiliation_string | University of Toronto |
| authorships[0].institutions[0].id | https://openalex.org/I185261750 |
| authorships[0].institutions[0].ror | https://ror.org/03dbr7087 |
| authorships[0].institutions[0].type | education |
| authorships[0].institutions[0].lineage | https://openalex.org/I185261750 |
| authorships[0].institutions[0].country_code | CA |
| authorships[0].institutions[0].display_name | University of Toronto |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Wenjun Qiu |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | University of Toronto |
| authorships[1].author.id | https://openalex.org/A5049933072 |
| authorships[1].author.orcid | https://orcid.org/0000-0002-2000-6827 |
| authorships[1].author.display_name | David Lie |
| authorships[1].countries | CA |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I185261750 |
| authorships[1].affiliations[0].raw_affiliation_string | University of Toronto |
| authorships[1].institutions[0].id | https://openalex.org/I185261750 |
| authorships[1].institutions[0].ror | https://ror.org/03dbr7087 |
| authorships[1].institutions[0].type | education |
| authorships[1].institutions[0].lineage | https://openalex.org/I185261750 |
| authorships[1].institutions[0].country_code | CA |
| authorships[1].institutions[0].display_name | University of Toronto |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | David Lie |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | University of Toronto |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2008.02954 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2020-08-13T00:00:00 |
| display_name | Deep Active Learning with Crowdsourcing Data for Privacy Policy Classification |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11704 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9984999895095825 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1706 |
| primary_topic.subfield.display_name | Computer Science Applications |
| primary_topic.display_name | Mobile Crowdsensing and Crowdsourcing |
| related_works | https://openalex.org/W2116878667, https://openalex.org/W3042284153, https://openalex.org/W4241527182, https://openalex.org/W2476957992, https://openalex.org/W1493227450, https://openalex.org/W2900699882, https://openalex.org/W4250923762, https://openalex.org/W576625533, https://openalex.org/W2025792237, https://openalex.org/W857570378 |
| cited_by_count | 1 |
| counts_by_year[0].year | 2024 |
| counts_by_year[0].cited_by_count | 1 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2008.02954 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2008.02954 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2008.02954 |
| primary_location.id | pmh:oai:arXiv.org:2008.02954 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2008.02954 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2008.02954 |
| publication_date | 2020-08-07 |
| publication_year | 2020 |
| referenced_works | https://openalex.org/W1580375566, https://openalex.org/W2552147134, https://openalex.org/W2207751595, https://openalex.org/W2952115634, https://openalex.org/W2117763124, https://openalex.org/W2775423040, https://openalex.org/W2517394750, https://openalex.org/W1513874326, https://openalex.org/W2785787385, https://openalex.org/W2963070937, https://openalex.org/W2914331073, https://openalex.org/W2338318698, https://openalex.org/W2612622960, https://openalex.org/W2493916176, https://openalex.org/W1521626219, https://openalex.org/W1882358344, https://openalex.org/W1522301498, https://openalex.org/W2541743403, https://openalex.org/W1510943169, https://openalex.org/W2171671120, https://openalex.org/W1706374111, https://openalex.org/W2899488745, https://openalex.org/W2896457183, https://openalex.org/W2041404167, https://openalex.org/W2905475835, https://openalex.org/W2796375395, https://openalex.org/W1493726523, https://openalex.org/W2951786554, https://openalex.org/W2339322788, https://openalex.org/W2960406524, https://openalex.org/W2048089596, https://openalex.org/W658020064, https://openalex.org/W2787106847, https://openalex.org/W2194775991, https://openalex.org/W2187456046, https://openalex.org/W2963917928, https://openalex.org/W2250879510, https://openalex.org/W2102162869, https://openalex.org/W2999309192, https://openalex.org/W2426031434, https://openalex.org/W2556888587, https://openalex.org/W2095705004, https://openalex.org/W1484084878, https://openalex.org/W2407776548, https://openalex.org/W2038496767, https://openalex.org/W2540392403 |
| referenced_works_count | 46 |
| abstract_inverted_index.a | 51, 200 |
| abstract_inverted_index.By | 188 |
| abstract_inverted_index.F1 | 144 |
| abstract_inverted_index.In | 73 |
| abstract_inverted_index.On | 134 |
| abstract_inverted_index.an | 85 |
| abstract_inverted_index.as | 170 |
| abstract_inverted_index.be | 48, 132 |
| abstract_inverted_index.by | 61, 107 |
| abstract_inverted_index.in | 165 |
| abstract_inverted_index.is | 98, 138 |
| abstract_inverted_index.of | 7, 68, 149, 156, 179, 186 |
| abstract_inverted_index.on | 32, 50 |
| abstract_inverted_index.to | 17, 23, 40, 47, 83, 100, 104, 124, 131, 140 |
| abstract_inverted_index.we | 76 |
| abstract_inverted_index.62% | 148 |
| abstract_inverted_index.and | 26, 71, 80 |
| abstract_inverted_index.are | 2, 15, 59, 172 |
| abstract_inverted_index.due | 22 |
| abstract_inverted_index.few | 13 |
| abstract_inverted_index.for | 36, 195 |
| abstract_inverted_index.our | 136 |
| abstract_inverted_index.the | 8, 24, 116, 127, 142, 150, 177, 184, 192 |
| abstract_inverted_index.use | 155 |
| abstract_inverted_index.Most | 55 |
| abstract_inverted_index.able | 99, 139 |
| abstract_inverted_index.also | 159 |
| abstract_inverted_index.data | 10 |
| abstract_inverted_index.done | 106 |
| abstract_inverted_index.from | 191 |
| abstract_inverted_index.high | 42, 112 |
| abstract_inverted_index.many | 173 |
| abstract_inverted_index.more | 174, 201 |
| abstract_inverted_index.most | 128 |
| abstract_inverted_index.need | 46 |
| abstract_inverted_index.only | 147 |
| abstract_inverted_index.read | 18 |
| abstract_inverted_index.same | 143 |
| abstract_inverted_index.set. | 204 |
| abstract_inverted_index.than | 182 |
| abstract_inverted_index.that | 4 |
| abstract_inverted_index.this | 74 |
| abstract_inverted_index.tool | 88 |
| abstract_inverted_index.with | 111 |
| abstract_inverted_index.While | 28 |
| abstract_inverted_index.based | 31 |
| abstract_inverted_index.class | 163, 194 |
| abstract_inverted_index.cost. | 118 |
| abstract_inverted_index.exist | 35 |
| abstract_inverted_index.hours | 70 |
| abstract_inverted_index.human | 63, 109 |
| abstract_inverted_index.labor | 69 |
| abstract_inverted_index.large | 52 |
| abstract_inverted_index.model | 137 |
| abstract_inverted_index.named | 89 |
| abstract_inverted_index.score | 145 |
| abstract_inverted_index.texts | 21 |
| abstract_inverted_index.there | 171 |
| abstract_inverted_index.those | 105 |
| abstract_inverted_index.tools | 30 |
| abstract_inverted_index.users | 6, 14 |
| abstract_inverted_index.using | 146 |
| abstract_inverted_index.which | 97 |
| abstract_inverted_index.while | 114 |
| abstract_inverted_index.Active | 92 |
| abstract_inverted_index.Policy | 95 |
| abstract_inverted_index.active | 78, 120, 157 |
| abstract_inverted_index.allows | 122 |
| abstract_inverted_index.amount | 67 |
| abstract_inverted_index.length | 25 |
| abstract_inverted_index.notify | 5 |
| abstract_inverted_index.paper, | 75 |
| abstract_inverted_index.policy | 20, 38, 57, 168 |
| abstract_inverted_index.select | 126 |
| abstract_inverted_index.Calpric | 90, 197 |
| abstract_inverted_index.PRIvacy | 94 |
| abstract_inverted_index.Privacy | 0 |
| abstract_inverted_index.absence | 185 |
| abstract_inverted_index.achieve | 41, 141 |
| abstract_inverted_index.corpora | 58 |
| abstract_inverted_index.creates | 199 |
| abstract_inverted_index.develop | 84 |
| abstract_inverted_index.effort. | 72, 153 |
| abstract_inverted_index.labeled | 53, 60 |
| abstract_inverted_index.machine | 33 |
| abstract_inverted_index.perform | 101 |
| abstract_inverted_index.privacy | 37, 167 |
| abstract_inverted_index.private | 180 |
| abstract_inverted_index.samples | 190 |
| abstract_inverted_index.skilled | 62, 108 |
| abstract_inverted_index.stating | 176, 183 |
| abstract_inverted_index.through | 19 |
| abstract_inverted_index.trained | 49 |
| abstract_inverted_index.willing | 16 |
| abstract_inverted_index.However, | 12 |
| abstract_inverted_index.Learning | 93 |
| abstract_inverted_index.accuracy | 113 |
| abstract_inverted_index.average, | 135 |
| abstract_inverted_index.balanced | 202 |
| abstract_inverted_index.dataset. | 54 |
| abstract_inverted_index.datasets | 169 |
| abstract_inverted_index.existing | 56 |
| abstract_inverted_index.labeled. | 133 |
| abstract_inverted_index.labeling | 117, 152 |
| abstract_inverted_index.learning | 34, 79, 121, 158 |
| abstract_inverted_index.leverage | 77 |
| abstract_inverted_index.minority | 193 |
| abstract_inverted_index.original | 151 |
| abstract_inverted_index.policies | 1 |
| abstract_inverted_index.segments | 130 |
| abstract_inverted_index.training | 203 |
| abstract_inverted_index.Calpric's | 154 |
| abstract_inverted_index.accuracy, | 44 |
| abstract_inverted_index.addresses | 160 |
| abstract_inverted_index.analysis, | 39 |
| abstract_inverted_index.automated | 29, 86 |
| abstract_inverted_index.imbalance | 164 |
| abstract_inverted_index.labeling, | 196 |
| abstract_inverted_index.naturally | 161 |
| abstract_inverted_index.occurring | 162 |
| abstract_inverted_index.requiring | 65 |
| abstract_inverted_index.selecting | 189 |
| abstract_inverted_index.services' | 9 |
| abstract_inverted_index.unlabeled | 166 |
| abstract_inverted_index.annotation | 102 |
| abstract_inverted_index.annotators | 110 |
| abstract_inverted_index.collection | 178 |
| abstract_inverted_index.equivalent | 103 |
| abstract_inverted_index.minimizing | 115 |
| abstract_inverted_index.practices. | 11 |
| abstract_inverted_index.statements | 3, 175 |
| abstract_inverted_index.techniques | 82 |
| abstract_inverted_index.annotators, | 64 |
| abstract_inverted_index.classifiers | 45, 123 |
| abstract_inverted_index.collection. | 187 |
| abstract_inverted_index.complexity. | 27 |
| abstract_inverted_index.information | 181 |
| abstract_inverted_index.informative | 129 |
| abstract_inverted_index.proactively | 125 |
| abstract_inverted_index.significant | 66 |
| abstract_inverted_index.Classifier), | 96 |
| abstract_inverted_index.Specifically, | 119 |
| abstract_inverted_index.automatically | 198 |
| abstract_inverted_index.crowdsourcing | 81 |
| abstract_inverted_index.(Crowdsourcing | 91 |
| abstract_inverted_index.classification | 43, 87 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 2 |
| sustainable_development_goals[0].id | https://metadata.un.org/sdg/8 |
| sustainable_development_goals[0].score | 0.6600000262260437 |
| sustainable_development_goals[0].display_name | Decent work and economic growth |
| citation_normalized_percentile | |
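
The `abstract_inverted_index.*` rows in the table above store the abstract as a mapping from each word to the positions where it occurs. The following sketch shows how such a mapping could be turned back into running text. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is a small excerpt of the rows above, restricted to positions 0 to 11.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {word: [positions]} mapping by sorting on position."""
    positioned = []
    for word, positions in inverted_index.items():
        for pos in positions:
            positioned.append((pos, word))
    positioned.sort()
    return " ".join(word for _, word in positioned)


# Excerpt of the payload rows, keeping only positions 0..11
# (words such as "are" and "users" also occur later in the full abstract).
excerpt = {
    "Privacy": [0],
    "policies": [1],
    "are": [2],
    "statements": [3],
    "that": [4],
    "notify": [5],
    "users": [6],
    "of": [7],
    "the": [8],
    "services'": [9],
    "data": [10],
    "practices.": [11],
}

text = rebuild_abstract(excerpt)
print(text)
```

Applied to the complete set of rows, the same procedure should yield the full abstract shown at the top of this record, since every word position from 0 to 204 appears exactly once in the index.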