Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection
2023 · Open Access · DOI: https://doi.org/10.48550/arxiv.2311.02573
This work presents an adaptive group testing framework for the range-based, high-dimensional near neighbor search problem. Our method efficiently marks each item in a database as a neighbor or non-neighbor of a query point, based on a cosine distance threshold, without exhaustive search. Like other methods for large-scale retrieval, our approach exploits the assumption that most of the items in the database are unrelated to the query. However, it does not assume a large difference between the cosine similarity of the query vector with the least related neighbor and that with the least unrelated non-neighbor. Following a multi-stage adaptive group testing algorithm based on binary splitting, we divide the set of items to be searched in half at each step and perform dot product tests on smaller and smaller subsets, many of which we are able to prune away. We show that, using softmax-based features, our method achieves a more than ten-fold speed-up over exhaustive search with no loss of accuracy on a variety of large datasets. Based on empirically verified models for the distribution of cosine distances, we present a theoretical analysis of the expected number of distance computations per query and the probability that a pool will be pruned. Our method has the following features: (i) unlike other methods, it implicitly exploits useful distributional properties of cosine distances; (ii) all required data structures are created purely offline; (iii) it does not impose any strong assumptions on the number of true near neighbors; (iv) it is adaptable to streaming settings where new vectors are dynamically added to the database; and (v) it does not require any parameter tuning. The high recall of our technique makes it particularly suited to plagiarism detection scenarios, where it is important to report every database item that is sufficiently similar to the query.
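The abstract does not give the pooling details, so the following is only a minimal sketch of how binary splitting with dot product tests could work, not the authors' implementation. It assumes that all feature vectors are non-negative and unit-norm (as normalized softmax outputs would be) and that each pool is represented by the sum of its members. Under those assumptions, the dot product of the query with a pool's sum is at least as large as the dot product with any single member, so a pool whose test falls below the threshold can be discarded whole. The names `build_pools`, `range_search`, and `rho` are hypothetical.

```python
import numpy as np


def build_pools(X):
    """Offline step: store the sum vector of every pool in a binary-splitting tree.

    Pools are index ranges [lo, hi) over the rows of X, keyed by (lo, hi).
    Assumes every entry of X is non-negative (an assumption of this sketch).
    """
    pools = {}

    def build(lo, hi):
        if hi - lo == 1:
            pools[(lo, hi)] = X[lo].copy()
        else:
            mid = (lo + hi) // 2
            pools[(lo, hi)] = build(lo, mid) + build(mid, hi)
        return pools[(lo, hi)]

    build(0, X.shape[0])
    return pools


def range_search(q, pools, n, rho):
    """Return all indices i with q . x_i >= rho and the number of dot products used."""
    hits, tests = [], 0
    stack = [(0, n)]
    while stack:
        lo, hi = stack.pop()
        tests += 1
        # With non-negative vectors, q . (sum of pool) bounds every member's
        # similarity from above, so a failing pool contains no neighbor.
        if q @ pools[(lo, hi)] < rho:
            continue
        if hi - lo == 1:
            hits.append(lo)  # a single-item pool that passes is a neighbor
        else:
            mid = (lo + hi) // 2
            stack.append((lo, mid))
            stack.append((mid, hi))
    return sorted(hits), tests
```

In this sketch the pool sums depend only on the database, which matches the abstract's statement that the data structures are built offline. The number of dot products per query is at most the number of tree nodes (2n - 1 for n items) and drops well below n when most pools are pruned near the root.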
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2311.02573, https://arxiv.org/pdf/2311.02573
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4388482439
Raw OpenAlex JSON
- OpenAlex ID (canonical identifier for this work in OpenAlex): https://openalex.org/W4388482439
- DOI (Digital Object Identifier): https://doi.org/10.48550/arxiv.2311.02573
- Title: Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection
- Type (OpenAlex work type): preprint
- Language (primary language): en
- Publication year: 2023
- Publication date: 2023-11-05
- Authors (in order): Kashish Mittal, Harsh Shah, Ajit Rajwade
- Landing page (publisher landing page): https://arxiv.org/abs/2311.02573
- PDF URL (direct link to full-text PDF): https://arxiv.org/pdf/2311.02573
- Open access (whether a free full text is available): Yes
- OA status (open access status per OpenAlex): green
- OA URL (direct OA link when available): https://arxiv.org/pdf/2311.02573
- Concepts (top fields/topics attached by OpenAlex): Nearest neighbor search, Best bin first, Cosine similarity, Dot product, Computer science, Set (abstract data type), Range (aeronautics), Algorithm, Binary search algorithm, k-nearest neighbors algorithm, Binary number, Similarity (geometry), Pattern recognition (psychology), Data mining, Search algorithm, Mathematics, Image (mathematics), Artificial intelligence, Materials science, Geometry, Programming language, Arithmetic, Composite material
- Cited by (total citation count in OpenAlex): 0
- Related works (count of other works algorithmically related by OpenAlex): 10
Full payload
| id | https://openalex.org/W4388482439 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2311.02573 |
| ids.doi | https://doi.org/10.48550/arxiv.2311.02573 |
| ids.openalex | https://openalex.org/W4388482439 |
| fwci | |
| type | preprint |
| title | Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11754 |
| topics[0].field.id | https://openalex.org/fields/27 |
| topics[0].field.display_name | Medicine |
| topics[0].score | 0.9990000128746033 |
| topics[0].domain.id | https://openalex.org/domains/4 |
| topics[0].domain.display_name | Health Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2725 |
| topics[0].subfield.display_name | Infectious Diseases |
| topics[0].display_name | SARS-CoV-2 detection and testing |
| topics[1].id | https://openalex.org/T11393 |
| topics[1].field.id | https://openalex.org/fields/22 |
| topics[1].field.display_name | Engineering |
| topics[1].score | 0.9872000217437744 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2204 |
| topics[1].subfield.display_name | Biomedical Engineering |
| topics[1].display_name | Biosensors and Analytical Detection |
| topics[2].id | https://openalex.org/T10207 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9847000241279602 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Advanced biosensing and bioanalysis techniques |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C116738811 |
| concepts[0].level | 2 |
| concepts[0].score | 0.7694472074508667 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q608751 |
| concepts[0].display_name | Nearest neighbor search |
| concepts[1].id | https://openalex.org/C161986146 |
| concepts[1].level | 3 |
| concepts[1].score | 0.6101216077804565 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q4896845 |
| concepts[1].display_name | Best bin first |
| concepts[2].id | https://openalex.org/C2780762811 |
| concepts[2].level | 3 |
| concepts[2].score | 0.5526425242424011 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1784941 |
| concepts[2].display_name | Cosine similarity |
| concepts[3].id | https://openalex.org/C32900221 |
| concepts[3].level | 2 |
| concepts[3].score | 0.5133754014968872 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q181365 |
| concepts[3].display_name | Dot product |
| concepts[4].id | https://openalex.org/C41008148 |
| concepts[4].level | 0 |
| concepts[4].score | 0.5111778378486633 |
| concepts[4].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[4].display_name | Computer science |
| concepts[5].id | https://openalex.org/C177264268 |
| concepts[5].level | 2 |
| concepts[5].score | 0.4980192184448242 |
| concepts[5].wikidata | https://www.wikidata.org/wiki/Q1514741 |
| concepts[5].display_name | Set (abstract data type) |
| concepts[6].id | https://openalex.org/C204323151 |
| concepts[6].level | 2 |
| concepts[6].score | 0.49169355630874634 |
| concepts[6].wikidata | https://www.wikidata.org/wiki/Q905424 |
| concepts[6].display_name | Range (aeronautics) |
| concepts[7].id | https://openalex.org/C11413529 |
| concepts[7].level | 1 |
| concepts[7].score | 0.4663298428058624 |
| concepts[7].wikidata | https://www.wikidata.org/wiki/Q8366 |
| concepts[7].display_name | Algorithm |
| concepts[8].id | https://openalex.org/C121610932 |
| concepts[8].level | 3 |
| concepts[8].score | 0.459652841091156 |
| concepts[8].wikidata | https://www.wikidata.org/wiki/Q243754 |
| concepts[8].display_name | Binary search algorithm |
| concepts[9].id | https://openalex.org/C113238511 |
| concepts[9].level | 2 |
| concepts[9].score | 0.4504547715187073 |
| concepts[9].wikidata | https://www.wikidata.org/wiki/Q1071612 |
| concepts[9].display_name | k-nearest neighbors algorithm |
| concepts[10].id | https://openalex.org/C48372109 |
| concepts[10].level | 2 |
| concepts[10].score | 0.43218040466308594 |
| concepts[10].wikidata | https://www.wikidata.org/wiki/Q3913 |
| concepts[10].display_name | Binary number |
| concepts[11].id | https://openalex.org/C103278499 |
| concepts[11].level | 3 |
| concepts[11].score | 0.4263957738876343 |
| concepts[11].wikidata | https://www.wikidata.org/wiki/Q254465 |
| concepts[11].display_name | Similarity (geometry) |
| concepts[12].id | https://openalex.org/C153180895 |
| concepts[12].level | 2 |
| concepts[12].score | 0.3977355360984802 |
| concepts[12].wikidata | https://www.wikidata.org/wiki/Q7148389 |
| concepts[12].display_name | Pattern recognition (psychology) |
| concepts[13].id | https://openalex.org/C124101348 |
| concepts[13].level | 1 |
| concepts[13].score | 0.3974217176437378 |
| concepts[13].wikidata | https://www.wikidata.org/wiki/Q172491 |
| concepts[13].display_name | Data mining |
| concepts[14].id | https://openalex.org/C125583679 |
| concepts[14].level | 2 |
| concepts[14].score | 0.3745371103286743 |
| concepts[14].wikidata | https://www.wikidata.org/wiki/Q755673 |
| concepts[14].display_name | Search algorithm |
| concepts[15].id | https://openalex.org/C33923547 |
| concepts[15].level | 0 |
| concepts[15].score | 0.36276328563690186 |
| concepts[15].wikidata | https://www.wikidata.org/wiki/Q395 |
| concepts[15].display_name | Mathematics |
| concepts[16].id | https://openalex.org/C115961682 |
| concepts[16].level | 2 |
| concepts[16].score | 0.28161686658859253 |
| concepts[16].wikidata | https://www.wikidata.org/wiki/Q860623 |
| concepts[16].display_name | Image (mathematics) |
| concepts[17].id | https://openalex.org/C154945302 |
| concepts[17].level | 1 |
| concepts[17].score | 0.216994971036911 |
| concepts[17].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[17].display_name | Artificial intelligence |
| concepts[18].id | https://openalex.org/C192562407 |
| concepts[18].level | 0 |
| concepts[18].score | 0.0 |
| concepts[18].wikidata | https://www.wikidata.org/wiki/Q228736 |
| concepts[18].display_name | Materials science |
| concepts[19].id | https://openalex.org/C2524010 |
| concepts[19].level | 1 |
| concepts[19].score | 0.0 |
| concepts[19].wikidata | https://www.wikidata.org/wiki/Q8087 |
| concepts[19].display_name | Geometry |
| concepts[20].id | https://openalex.org/C199360897 |
| concepts[20].level | 1 |
| concepts[20].score | 0.0 |
| concepts[20].wikidata | https://www.wikidata.org/wiki/Q9143 |
| concepts[20].display_name | Programming language |
| concepts[21].id | https://openalex.org/C94375191 |
| concepts[21].level | 1 |
| concepts[21].score | 0.0 |
| concepts[21].wikidata | https://www.wikidata.org/wiki/Q11205 |
| concepts[21].display_name | Arithmetic |
| concepts[22].id | https://openalex.org/C159985019 |
| concepts[22].level | 1 |
| concepts[22].score | 0.0 |
| concepts[22].wikidata | https://www.wikidata.org/wiki/Q181790 |
| concepts[22].display_name | Composite material |
| keywords[0].id | https://openalex.org/keywords/nearest-neighbor-search |
| keywords[0].score | 0.7694472074508667 |
| keywords[0].display_name | Nearest neighbor search |
| keywords[1].id | https://openalex.org/keywords/best-bin-first |
| keywords[1].score | 0.6101216077804565 |
| keywords[1].display_name | Best bin first |
| keywords[2].id | https://openalex.org/keywords/cosine-similarity |
| keywords[2].score | 0.5526425242424011 |
| keywords[2].display_name | Cosine similarity |
| keywords[3].id | https://openalex.org/keywords/dot-product |
| keywords[3].score | 0.5133754014968872 |
| keywords[3].display_name | Dot product |
| keywords[4].id | https://openalex.org/keywords/computer-science |
| keywords[4].score | 0.5111778378486633 |
| keywords[4].display_name | Computer science |
| keywords[5].id | https://openalex.org/keywords/set |
| keywords[5].score | 0.4980192184448242 |
| keywords[5].display_name | Set (abstract data type) |
| keywords[6].id | https://openalex.org/keywords/range |
| keywords[6].score | 0.49169355630874634 |
| keywords[6].display_name | Range (aeronautics) |
| keywords[7].id | https://openalex.org/keywords/algorithm |
| keywords[7].score | 0.4663298428058624 |
| keywords[7].display_name | Algorithm |
| keywords[8].id | https://openalex.org/keywords/binary-search-algorithm |
| keywords[8].score | 0.459652841091156 |
| keywords[8].display_name | Binary search algorithm |
| keywords[9].id | https://openalex.org/keywords/k-nearest-neighbors-algorithm |
| keywords[9].score | 0.4504547715187073 |
| keywords[9].display_name | k-nearest neighbors algorithm |
| keywords[10].id | https://openalex.org/keywords/binary-number |
| keywords[10].score | 0.43218040466308594 |
| keywords[10].display_name | Binary number |
| keywords[11].id | https://openalex.org/keywords/similarity |
| keywords[11].score | 0.4263957738876343 |
| keywords[11].display_name | Similarity (geometry) |
| keywords[12].id | https://openalex.org/keywords/pattern-recognition |
| keywords[12].score | 0.3977355360984802 |
| keywords[12].display_name | Pattern recognition (psychology) |
| keywords[13].id | https://openalex.org/keywords/data-mining |
| keywords[13].score | 0.3974217176437378 |
| keywords[13].display_name | Data mining |
| keywords[14].id | https://openalex.org/keywords/search-algorithm |
| keywords[14].score | 0.3745371103286743 |
| keywords[14].display_name | Search algorithm |
| keywords[15].id | https://openalex.org/keywords/mathematics |
| keywords[15].score | 0.36276328563690186 |
| keywords[15].display_name | Mathematics |
| keywords[16].id | https://openalex.org/keywords/image |
| keywords[16].score | 0.28161686658859253 |
| keywords[16].display_name | Image (mathematics) |
| keywords[17].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[17].score | 0.216994971036911 |
| keywords[17].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2311.02573 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2311.02573 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2311.02573 |
| locations[1].id | doi:10.48550/arxiv.2311.02573 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2311.02573 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5025715576 |
| authorships[0].author.orcid | https://orcid.org/0009-0000-2835-3797 |
| authorships[0].author.display_name | Kashish Mittal |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Mittal, Kashish |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5102730674 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-3903-1309 |
| authorships[1].author.display_name | Harsh Shah |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Shah, Harsh |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5072824358 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-6463-3315 |
| authorships[2].author.display_name | Ajit Rajwade |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Rajwade, Ajit |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2311.02573 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2023-11-08T00:00:00 |
| display_name | Group Testing for Accurate and Efficient Range-Based Near Neighbor Search for Plagiarism Detection |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11754 |
| primary_topic.field.id | https://openalex.org/fields/27 |
| primary_topic.field.display_name | Medicine |
| primary_topic.score | 0.9990000128746033 |
| primary_topic.domain.id | https://openalex.org/domains/4 |
| primary_topic.domain.display_name | Health Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2725 |
| primary_topic.subfield.display_name | Infectious Diseases |
| primary_topic.display_name | SARS-CoV-2 detection and testing |
| related_works | https://openalex.org/W2381195555, https://openalex.org/W4246757943, https://openalex.org/W2182477562, https://openalex.org/W2109424811, https://openalex.org/W1595303882, https://openalex.org/W1558159560, https://openalex.org/W2148008870, https://openalex.org/W1517788997, https://openalex.org/W4388482439, https://openalex.org/W2124509324 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2311.02573 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2311.02573 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2311.02573 |
| primary_location.id | pmh:oai:arXiv.org:2311.02573 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2311.02573 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2311.02573 |
| publication_date | 2023-11-05 |
| publication_year | 2023 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 24, 31, 36, 73, 97, 149, 163, 181, 197 |
| abstract_inverted_index.It | 209, 231, 246, 263 |
| abstract_inverted_index.We | 140 |
| abstract_inverted_index.an | 3 |
| abstract_inverted_index.as | 26 |
| abstract_inverted_index.at | 118 |
| abstract_inverted_index.be | 114, 200 |
| abstract_inverted_index.in | 23, 60 |
| abstract_inverted_index.is | 247, 286, 294 |
| abstract_inverted_index.it | 69, 277, 285 |
| abstract_inverted_index.no | 158 |
| abstract_inverted_index.of | 30, 57, 80, 111, 132, 160, 165, 176, 184, 188, 215, 241, 273 |
| abstract_inverted_index.on | 35, 104, 126, 162, 169, 238 |
| abstract_inverted_index.or | 28 |
| abstract_inverted_index.to | 65, 113, 137, 249, 258, 280, 288, 298 |
| abstract_inverted_index.we | 107, 134, 179 |
| abstract_inverted_index.(i) | 208 |
| abstract_inverted_index.(v) | 262 |
| abstract_inverted_index.All | 222 |
| abstract_inverted_index.Our | 17, 202 |
| abstract_inverted_index.The | 270 |
| abstract_inverted_index.and | 89, 121, 128, 193, 261 |
| abstract_inverted_index.any | 235, 267 |
| abstract_inverted_index.are | 63, 135, 226, 255 |
| abstract_inverted_index.dot | 123 |
| abstract_inverted_index.for | 8, 46, 173 |
| abstract_inverted_index.has | 204 |
| abstract_inverted_index.new | 253 |
| abstract_inverted_index.not | 71, 233, 265 |
| abstract_inverted_index.our | 50, 146, 274 |
| abstract_inverted_index.per | 191 |
| abstract_inverted_index.set | 110 |
| abstract_inverted_index.the | 9, 53, 58, 61, 66, 77, 81, 85, 92, 109, 174, 185, 194, 205, 239, 259, 299 |
| abstract_inverted_index.(ii) | 221 |
| abstract_inverted_index.(iv) | 245 |
| abstract_inverted_index.Like | 43 |
| abstract_inverted_index.This | 0 |
| abstract_inverted_index.able | 136 |
| abstract_inverted_index.data | 224 |
| abstract_inverted_index.does | 70, 232, 264 |
| abstract_inverted_index.each | 21, 119 |
| abstract_inverted_index.half | 117 |
| abstract_inverted_index.high | 11, 271 |
| abstract_inverted_index.into | 116 |
| abstract_inverted_index.item | 22, 292, 297 |
| abstract_inverted_index.loss | 159 |
| abstract_inverted_index.many | 131 |
| abstract_inverted_index.more | 150 |
| abstract_inverted_index.most | 56 |
| abstract_inverted_index.near | 13, 243 |
| abstract_inverted_index.over | 154 |
| abstract_inverted_index.pool | 198 |
| abstract_inverted_index.show | 141 |
| abstract_inverted_index.than | 151 |
| abstract_inverted_index.that | 55, 90, 196, 293 |
| abstract_inverted_index.true | 242 |
| abstract_inverted_index.will | 199 |
| abstract_inverted_index.with | 84, 91, 157 |
| abstract_inverted_index.work | 1 |
| abstract_inverted_index.(iii) | 230 |
| abstract_inverted_index.Based | 168 |
| abstract_inverted_index.added | 257 |
| abstract_inverted_index.away. | 139 |
| abstract_inverted_index.based | 34, 103 |
| abstract_inverted_index.every | 290 |
| abstract_inverted_index.group | 5, 100 |
| abstract_inverted_index.items | 59, 112 |
| abstract_inverted_index.large | 47, 74, 166 |
| abstract_inverted_index.least | 86, 93 |
| abstract_inverted_index.makes | 276 |
| abstract_inverted_index.marks | 20 |
| abstract_inverted_index.other | 44, 219 |
| abstract_inverted_index.prune | 138 |
| abstract_inverted_index.query | 32, 82, 192 |
| abstract_inverted_index.scale | 48 |
| abstract_inverted_index.step, | 120 |
| abstract_inverted_index.tests | 125 |
| abstract_inverted_index.that, | 142 |
| abstract_inverted_index.using | 143 |
| abstract_inverted_index.where | 252, 284 |
| abstract_inverted_index.which | 133 |
| abstract_inverted_index.assume | 72 |
| abstract_inverted_index.binary | 105 |
| abstract_inverted_index.cosine | 37, 78, 177, 216 |
| abstract_inverted_index.divide | 108 |
| abstract_inverted_index.impose | 234 |
| abstract_inverted_index.method | 18, 147, 203 |
| abstract_inverted_index.models | 172 |
| abstract_inverted_index.number | 187, 240 |
| abstract_inverted_index.point, | 33 |
| abstract_inverted_index.purely | 228 |
| abstract_inverted_index.query. | 67, 300 |
| abstract_inverted_index.recall | 272 |
| abstract_inverted_index.report | 289 |
| abstract_inverted_index.search | 15, 156 |
| abstract_inverted_index.strong | 236 |
| abstract_inverted_index.suited | 279 |
| abstract_inverted_index.unlike | 218 |
| abstract_inverted_index.useful | 212 |
| abstract_inverted_index.vector | 83 |
| abstract_inverted_index.between | 76 |
| abstract_inverted_index.created | 227 |
| abstract_inverted_index.methods | 45 |
| abstract_inverted_index.perform | 122 |
| abstract_inverted_index.present | 180 |
| abstract_inverted_index.product | 124 |
| abstract_inverted_index.pruned. | 201 |
| abstract_inverted_index.related | 87 |
| abstract_inverted_index.require | 266 |
| abstract_inverted_index.search. | 42 |
| abstract_inverted_index.similar | 296 |
| abstract_inverted_index.smaller | 127, 129 |
| abstract_inverted_index.testing | 6, 101 |
| abstract_inverted_index.tuning. | 269 |
| abstract_inverted_index.variety | 164 |
| abstract_inverted_index.vectors | 254 |
| abstract_inverted_index.without | 40 |
| abstract_inverted_index.However, | 68 |
| abstract_inverted_index.achieves | 148 |
| abstract_inverted_index.adaptive | 4, 99 |
| abstract_inverted_index.analysis | 183 |
| abstract_inverted_index.approach | 51 |
| abstract_inverted_index.database | 25, 62, 291 |
| abstract_inverted_index.distance | 38, 189 |
| abstract_inverted_index.expected | 186 |
| abstract_inverted_index.exploits | 52, 211 |
| abstract_inverted_index.methods; | 220 |
| abstract_inverted_index.neighbor | 14, 27, 88 |
| abstract_inverted_index.offline; | 229 |
| abstract_inverted_index.presents | 2 |
| abstract_inverted_index.problem. | 16 |
| abstract_inverted_index.required | 223 |
| abstract_inverted_index.searched | 115 |
| abstract_inverted_index.settings | 251 |
| abstract_inverted_index.speed-up | 153 |
| abstract_inverted_index.subsets, | 130 |
| abstract_inverted_index.ten-fold | 152 |
| abstract_inverted_index.verified | 171 |
| abstract_inverted_index.Following | 96 |
| abstract_inverted_index.accuracy, | 161 |
| abstract_inverted_index.adaptable | 248 |
| abstract_inverted_index.algorithm | 102 |
| abstract_inverted_index.database; | 260 |
| abstract_inverted_index.datasets. | 167 |
| abstract_inverted_index.detection | 282 |
| abstract_inverted_index.distances | 217 |
| abstract_inverted_index.features, | 145 |
| abstract_inverted_index.features: | 207 |
| abstract_inverted_index.following | 206 |
| abstract_inverted_index.framework | 7 |
| abstract_inverted_index.important | 287 |
| abstract_inverted_index.parameter | 268 |
| abstract_inverted_index.scenarios | 283 |
| abstract_inverted_index.streaming | 250 |
| abstract_inverted_index.technique | 275 |
| abstract_inverted_index.threshold | 39 |
| abstract_inverted_index.unrelated | 64, 94 |
| abstract_inverted_index.assumption | 54 |
| abstract_inverted_index.difference | 75 |
| abstract_inverted_index.distances, | 178 |
| abstract_inverted_index.exhaustive | 41, 155 |
| abstract_inverted_index.implicitly | 210 |
| abstract_inverted_index.neighbors; | 244 |
| abstract_inverted_index.plagiarism | 281 |
| abstract_inverted_index.properties | 214 |
| abstract_inverted_index.retrieval, | 49 |
| abstract_inverted_index.similarity | 79 |
| abstract_inverted_index.splitting, | 106 |
| abstract_inverted_index.structures | 225 |
| abstract_inverted_index.assumptions | 237 |
| abstract_inverted_index.dimensional | 12 |
| abstract_inverted_index.dynamically | 256 |
| abstract_inverted_index.efficiently | 19 |
| abstract_inverted_index.empirically | 170 |
| abstract_inverted_index.multi-stage | 98 |
| abstract_inverted_index.probability | 195 |
| abstract_inverted_index.range-based | 10 |
| abstract_inverted_index.theoretical | 182 |
| abstract_inverted_index.computations | 190 |
| abstract_inverted_index.distribution | 175 |
| abstract_inverted_index.non-neighbor | 29 |
| abstract_inverted_index.particularly | 278 |
| abstract_inverted_index.sufficiently | 295 |
| abstract_inverted_index.non-neighbor. | 95 |
| abstract_inverted_index.softmax-based | 144 |
| abstract_inverted_index.distributional | 213 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile |
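
The `abstract_inverted_index.*` rows in the payload store the abstract as a map from each word to the zero-based positions where it occurs. As a rough illustration, the sketch below turns such a map back into running text. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is a shortened excerpt of the rows above (the first eight positions only), not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a word -> list-of-positions mapping."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    # Emit the words in position order, separated by single spaces.
    return " ".join(by_position[pos] for pos in sorted(by_position))


# Shortened excerpt of the payload rows (positions 0 to 7 only).
sample = {
    "This": [0],
    "work": [1],
    "presents": [2],
    "an": [3],
    "adaptive": [4],
    "group": [5],
    "testing": [6],
    "framework": [7],
}
print(rebuild_abstract(sample))
```

Words that occur more than once, such as `the` in the table above, simply list several positions, so the same loop handles them without special cases.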