When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2508.07299
Neuro-symbolic (Nesy) learning improves target-task performance by enabling models to satisfy knowledge, while semi/self-supervised learning (SSL) improves target-task performance by designing unsupervised pretext tasks on unlabeled data so that models satisfy the corresponding assumptions. We extend the Nesy theory, which is based on reliable knowledge, to the scenario of unreliable knowledge (i.e., assumptions), thereby unifying the theoretical frameworks of SSL and Nesy. Through rigorous theoretical analysis, we demonstrate that the impact of pretext tasks on target performance hinges on three factors: knowledge learnability with respect to the model, knowledge reliability with respect to the data, and knowledge completeness with respect to the target. We further propose schemes to operationalize these theoretical metrics and thereby develop a method that can predict the effectiveness of pretext tasks in advance. This will change the current status quo in practical applications, where unsupervised tasks are selected heuristically rather than on theoretical grounds, and where it is difficult to judge whether a pretext task is a reasonable choice before testing the model on the target task. In experiments, we verify a high correlation between the predicted performance (estimated using minimal data) and the actual performance achieved after large-scale semi-supervised or self-supervised learning, thus confirming the validity of the theory and the effectiveness of the evaluation method.
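As a rough illustration of the experimental check described in the abstract, the sketch below computes a rank correlation between predicted and actual scores for a handful of pretext tasks. The abstract does not say which correlation measure the authors use; Spearman's coefficient is an assumption made here, and the function names and the toy numbers are hypothetical placeholders, not results from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder scores for four pretext tasks (NOT from the paper):
# a cheap prediction from minimal data vs. accuracy after full training.
toy_predicted = [0.62, 0.71, 0.55, 0.80]
toy_actual = [0.64, 0.70, 0.58, 0.83]
toy_correlation = spearman(toy_predicted, toy_actual)
```

A rank-based measure fits the selection use case, because choosing a pretext task only requires that the predicted ordering of candidates match the actual ordering.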
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2508.07299, https://arxiv.org/pdf/2508.07299
- OA Status: green
- OpenAlex ID: https://openalex.org/W4416241982
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4416241982 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2508.07299 (Digital Object Identifier)
- Title: When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025 (year of publication)
- Publication date: 2025-08-10 (full publication date if available)
- Authors: Lin-Han Jia, Siyu Han, Wenchao Hu, Jie-Jing Shao, Wei Wei, Zhi Zhou, Lan-Zhe Guo, Yu-Feng Li (list of authors in order)
- Landing page: https://arxiv.org/abs/2508.07299 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2508.07299 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2508.07299 (direct OA link when available)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| id | https://openalex.org/W4416241982 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2508.07299 |
| ids.doi | https://doi.org/10.48550/arxiv.2508.07299 |
| ids.openalex | https://openalex.org/W4416241982 |
| fwci | |
| type | preprint |
| title | When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2508.07299 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2508.07299 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2508.07299 |
| locations[1].id | doi:10.48550/arxiv.2508.07299 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2508.07299 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5100595775 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Lin-Han Jia |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Jia, Lin-Han |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5006922566 |
| authorships[1].author.orcid | https://orcid.org/0000-0001-8709-5564 |
| authorships[1].author.display_name | Siyu Han |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Han, Si-Yu |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5112700933 |
| authorships[2].author.orcid | |
| authorships[2].author.display_name | Wenchao Hu |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Hu, Wen-Chao |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5087294333 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-8107-114X |
| authorships[3].author.display_name | Jie-Jing Shao |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Shao, Jie-Jing |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5100323678 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-0960-7269 |
| authorships[4].author.display_name | Wei Wei |
| authorships[4].author_position | middle |
| authorships[4].raw_author_name | Wei, Wen-Da |
| authorships[4].is_corresponding | False |
| authorships[5].author.id | https://openalex.org/A5101040069 |
| authorships[5].author.orcid | |
| authorships[5].author.display_name | Zhi Zhou |
| authorships[5].author_position | middle |
| authorships[5].raw_author_name | Zhou, Zhi |
| authorships[5].is_corresponding | False |
| authorships[6].author.id | https://openalex.org/A5047808444 |
| authorships[6].author.orcid | https://orcid.org/0000-0001-8965-1288 |
| authorships[6].author.display_name | Lan-Zhe Guo |
| authorships[6].author_position | middle |
| authorships[6].raw_author_name | Guo, Lan-Zhe |
| authorships[6].is_corresponding | False |
| authorships[7].author.id | https://openalex.org/A5100355152 |
| authorships[7].author.orcid | https://orcid.org/0000-0002-7727-4304 |
| authorships[7].author.display_name | Yu-Feng Li |
| authorships[7].author_position | last |
| authorships[7].raw_author_name | Li, Yu-Feng |
| authorships[7].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2508.07299 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | When Is Prior Knowledge Helpful? Exploring the Evaluation and Selection of Unsupervised Pretext Tasks from a Neuro-Symbolic Perspective |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-28T09:05:27.415739 |
| primary_topic | |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2508.07299 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2508.07299 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2508.07299 |
| primary_location.id | pmh:oai:arXiv.org:2508.07299 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2508.07299 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2508.07299 |
| publication_date | 2025-08-10 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 120, 178 |
| abstract_inverted_index.In | 174 |
| abstract_inverted_index.We | 39, 108 |
| abstract_inverted_index.by | 10, 25 |
| abstract_inverted_index.in | 72, 130, 139 |
| abstract_inverted_index.is | 155 |
| abstract_inverted_index.it | 154 |
| abstract_inverted_index.of | 8, 51, 61, 76, 127, 145, 161, 202, 208 |
| abstract_inverted_index.on | 45, 79, 83, 170 |
| abstract_inverted_index.or | 195 |
| abstract_inverted_index.to | 13, 33, 48, 90, 97, 105, 112, 157 |
| abstract_inverted_index.we | 69, 176 |
| abstract_inverted_index.SSL | 62 |
| abstract_inverted_index.and | 63, 100, 117, 153, 205 |
| abstract_inverted_index.are | 148 |
| abstract_inverted_index.can | 123 |
| abstract_inverted_index.for | 30 |
| abstract_inverted_index.quo | 138 |
| abstract_inverted_index.the | 4, 21, 41, 49, 58, 74, 91, 98, 106, 125, 135, 143, 159, 168, 171, 182, 188, 200, 203, 206, 209 |
| abstract_inverted_index.Nesy | 42 |
| abstract_inverted_index.This | 132 |
| abstract_inverted_index.data | 32 |
| abstract_inverted_index.high | 179 |
| abstract_inverted_index.make | 34 |
| abstract_inverted_index.task | 6, 23, 164 |
| abstract_inverted_index.than | 151 |
| abstract_inverted_index.that | 122 |
| abstract_inverted_index.them | 12 |
| abstract_inverted_index.thus | 198 |
| abstract_inverted_index.will | 133 |
| abstract_inverted_index.with | 88, 95, 103 |
| abstract_inverted_index.(SSL) | 19 |
| abstract_inverted_index.Nesy. | 64 |
| abstract_inverted_index.after | 192 |
| abstract_inverted_index.based | 44 |
| abstract_inverted_index.data, | 99 |
| abstract_inverted_index.model | 169 |
| abstract_inverted_index.task. | 173 |
| abstract_inverted_index.tasks | 29, 78, 129, 147 |
| abstract_inverted_index.that, | 71 |
| abstract_inverted_index.these | 114 |
| abstract_inverted_index.three | 84 |
| abstract_inverted_index.using | 185 |
| abstract_inverted_index.where | 142 |
| abstract_inverted_index.while | 16 |
| abstract_inverted_index.(Nesy) | 1 |
| abstract_inverted_index.(i.e., | 54 |
| abstract_inverted_index.actual | 189 |
| abstract_inverted_index.before | 166 |
| abstract_inverted_index.change | 134 |
| abstract_inverted_index.extend | 40 |
| abstract_inverted_index.hinges | 82 |
| abstract_inverted_index.impact | 75 |
| abstract_inverted_index.method | 121 |
| abstract_inverted_index.model, | 92 |
| abstract_inverted_index.models | 9, 35 |
| abstract_inverted_index.rather | 150 |
| abstract_inverted_index.status | 137 |
| abstract_inverted_index.target | 5, 22, 80, 172 |
| abstract_inverted_index.theory | 43, 204 |
| abstract_inverted_index.verify | 177 |
| abstract_inverted_index.Through | 65 |
| abstract_inverted_index.between | 181 |
| abstract_inverted_index.current | 136 |
| abstract_inverted_index.develop | 119 |
| abstract_inverted_index.further | 109 |
| abstract_inverted_index.method. | 211 |
| abstract_inverted_index.minimal | 186 |
| abstract_inverted_index.predict | 124 |
| abstract_inverted_index.pretext | 28, 77, 128, 163 |
| abstract_inverted_index.propose | 110 |
| abstract_inverted_index.respect | 89, 96, 104 |
| abstract_inverted_index.satisfy | 14, 36 |
| abstract_inverted_index.schemes | 111 |
| abstract_inverted_index.target. | 107 |
| abstract_inverted_index.testing | 167 |
| abstract_inverted_index.theory, | 73 |
| abstract_inverted_index.thereby | 56, 118 |
| abstract_inverted_index.achieved | 191 |
| abstract_inverted_index.advance. | 131 |
| abstract_inverted_index.data-and | 187 |
| abstract_inverted_index.enabling | 11 |
| abstract_inverted_index.evaluate | 158 |
| abstract_inverted_index.factors: | 85 |
| abstract_inverted_index.improves | 3, 20 |
| abstract_inverted_index.learning | 2, 18 |
| abstract_inverted_index.metrics, | 116 |
| abstract_inverted_index.reliable | 46 |
| abstract_inverted_index.rigorous | 66 |
| abstract_inverted_index.scenario | 50 |
| abstract_inverted_index.unifying | 57 |
| abstract_inverted_index.validity | 201 |
| abstract_inverted_index.analysis, | 68 |
| abstract_inverted_index.designing | 26 |
| abstract_inverted_index.difficult | 156 |
| abstract_inverted_index.knowledge | 47, 53, 86, 93, 101 |
| abstract_inverted_index.learning, | 197 |
| abstract_inverted_index.practical | 140 |
| abstract_inverted_index.predicted | 183 |
| abstract_inverted_index.selection | 165 |
| abstract_inverted_index.unlabeled | 31 |
| abstract_inverted_index.confirming | 199 |
| abstract_inverted_index.evaluation | 210 |
| abstract_inverted_index.frameworks | 60 |
| abstract_inverted_index.knowledge, | 15 |
| abstract_inverted_index.selections | 144 |
| abstract_inverted_index.unreliable | 52 |
| abstract_inverted_index.correlation | 180 |
| abstract_inverted_index.demonstrate | 70 |
| abstract_inverted_index.large-scale | 193 |
| abstract_inverted_index.performance | 7, 24, 81, 190 |
| abstract_inverted_index.rationality | 160 |
| abstract_inverted_index.reliability | 94 |
| abstract_inverted_index.theoretical | 59, 67, 115 |
| abstract_inverted_index.assumptions. | 38 |
| abstract_inverted_index.completeness | 102 |
| abstract_inverted_index.experiments, | 175 |
| abstract_inverted_index.learnability | 87 |
| abstract_inverted_index.unsupervised | 27, 146, 162 |
| abstract_inverted_index.applications, | 141 |
| abstract_inverted_index.assumptions), | 55 |
| abstract_inverted_index.corresponding | 37 |
| abstract_inverted_index.effectiveness | 126, 207 |
| abstract_inverted_index.theory-based, | 152 |
| abstract_inverted_index.Neuro-symbolic | 0 |
| abstract_inverted_index.operationalize | 113 |
| abstract_inverted_index.heuristic-based | 149 |
| abstract_inverted_index.self-supervised | 196 |
| abstract_inverted_index.semi-supervised | 194 |
| abstract_inverted_index.semi/self-supervised | 17 |
| abstract_inverted_index.performance-estimated | 184 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 8 |
| citation_normalized_percentile | |
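
The `abstract_inverted_index.*` rows above store the abstract as a map from each token to the word positions where it occurs. A minimal sketch of turning such a map back into running text follows; the function name `rebuild_abstract` and the `sample_index` variable are illustrative assumptions, and the sample holds only the first eight positions taken from the table.

```python
from typing import Dict, List


def rebuild_abstract(inverted_index: Dict[str, List[int]]) -> str:
    """Rebuild text from a token -> positions map by sorting on position."""
    positioned = []
    for token, positions in inverted_index.items():
        for position in positions:
            positioned.append((position, token))
    positioned.sort()  # tuples sort by position first
    return " ".join(token for _, token in positioned)


# Positions 0-7 only, copied from the payload table above; later
# occurrences of the same tokens (e.g. "learning" at 18) are left out.
sample_index = {
    "Neuro-symbolic": [0],
    "(Nesy)": [1],
    "learning": [2],
    "improves": [3],
    "the": [4],
    "target": [5],
    "task": [6],
    "performance": [7],
}

opening_words = rebuild_abstract(sample_index)
```

Passing the full set of rows through the same function should yield the complete abstract, with punctuation attached to tokens exactly as stored in the index.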