Hierarchical Blockmodelling for Knowledge Graphs
· 2024
· Open Access
· DOI: https://doi.org/10.48550/arxiv.2408.15649
In this paper, we investigate the use of probabilistic graphical models, specifically stochastic blockmodels, for hierarchical entity clustering on knowledge graphs. These models, seldom used in the Semantic Web community, decompose a graph into a set of probability distributions. The parameters of these distributions are then inferred, which allows them to be sampled to generate a random graph. In a non-parametric setting, this allows hierarchical clusterings to be induced without prior constraints on the hierarchy's structure. Specifically, this is achieved by integrating the Nested Chinese Restaurant Process and the Stick Breaking Process into the generative model. We propose a model that leverages this integration and derive a collapsed Gibbs sampling scheme for its inference. To aid understanding, we describe the steps of this derivation and provide an implementation of the sampler. We evaluate our model on synthetic and real-world datasets and compare it quantitatively against benchmark models. We further evaluate our results qualitatively and find that our model is capable of inducing coherent cluster hierarchies in small-scale settings. The work presented in this paper provides the first step toward applying stochastic blockmodels to knowledge graphs on a larger scale. We conclude with potential avenues for future work on more scalable inference schemes.
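
The two nonparametric priors named in the abstract can be illustrated with a minimal sketch. It is not the authors' sampler or generative model: the function names, the truncation to a fixed number of sticks, the fixed tree depth, and the hyperparameters `alpha` and `gamma` are all assumptions made for illustration. It shows only the textbook forms of the Stick Breaking Process (weights built from Beta(1, alpha) fractions) and the Nested Chinese Restaurant Process (each entity walks from the root, choosing a child at every node by a Chinese Restaurant Process).

```python
import random


def stick_breaking(alpha, num_sticks, rng):
    """Truncated stick-breaking weights (assumed truncation, for illustration).

    Repeatedly break off a Beta(1, alpha) fraction of the remaining stick.
    """
    weights = []
    remaining = 1.0
    for _ in range(num_sticks - 1):
        fraction = rng.betavariate(1.0, alpha)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)  # leftover mass goes to the last stick
    return weights


def crp_choice(counts, gamma, rng):
    """One Chinese Restaurant Process draw.

    Existing child k is chosen with probability counts[k] / (n + gamma);
    a new child (index len(counts)) with probability gamma / (n + gamma).
    """
    threshold = rng.random() * (sum(counts) + gamma)
    cumulative = 0.0
    for index, count in enumerate(counts):
        cumulative += count
        if threshold < cumulative:
            return index
    return len(counts)


def sample_ncrp_paths(num_entities, depth, gamma, rng):
    """Nested CRP: draw a root-to-leaf path for each entity.

    Nodes are identified by the tuple of child indices leading to them.
    The fixed depth is a simplifying assumption of this sketch.
    """
    child_counts = {}  # node -> visit counts of its children
    paths = []
    for _ in range(num_entities):
        node = ()
        for _ in range(depth):
            counts = child_counts.setdefault(node, [])
            choice = crp_choice(counts, gamma, rng)
            if choice == len(counts):
                counts.append(0)  # open a new child under this node
            counts[choice] += 1
            node = node + (choice,)
        paths.append(node)
    return paths


rng = random.Random(0)
level_weights = stick_breaking(alpha=1.0, num_sticks=4, rng=rng)
paths = sample_ncrp_paths(num_entities=10, depth=3, gamma=1.0, rng=rng)
```

Entities whose paths share a prefix fall into the same cluster at that level of the tree, which is how a hierarchy can emerge without fixing the number of clusters in advance. How the paper combines these priors with the blockmodel likelihood is not stated in this record and is not reproduced here.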
Related Topics
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2408.15649, https://arxiv.org/pdf/2408.15649
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4402705914
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4402705914 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2408.15649 (Digital Object Identifier)
- Title: Hierarchical Blockmodelling for Knowledge Graphs (work title)
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2024 (year of publication)
- Publication date: 2024-08-28 (full publication date if available)
- Authors: Marcin Pietrasik, Marek Reformat, Anna Wilbik (list of authors in order)
- Landing page: https://arxiv.org/abs/2408.15649 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2408.15649 (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2408.15649 (direct OA link when available)
- Concepts: Knowledge graph, Computer science, Geography, Artificial intelligence (top concepts (fields/topics) attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| Field | Value |
|---|---|
| id | https://openalex.org/W4402705914 |
| doi | https://doi.org/10.48550/arxiv.2408.15649 |
| ids.doi | https://doi.org/10.48550/arxiv.2408.15649 |
| ids.openalex | https://openalex.org/W4402705914 |
| fwci | |
| type | preprint |
| title | Hierarchical Blockmodelling for Knowledge Graphs |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T10215 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.9891999959945679 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Semantic Web and Ontologies |
| topics[1].id | https://openalex.org/T13062 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9729999899864197 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Cognitive Computing and Networks |
| topics[2].id | https://openalex.org/T11063 |
| topics[2].field.id | https://openalex.org/fields/17 |
| topics[2].field.display_name | Computer Science |
| topics[2].score | 0.9351000189781189 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1703 |
| topics[2].subfield.display_name | Computational Theory and Mathematics |
| topics[2].display_name | Rough Sets and Fuzzy Logic |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C2987255567 |
| concepts[0].level | 2 |
| concepts[0].score | 0.5747204422950745 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q33002955 |
| concepts[0].display_name | Knowledge graph |
| concepts[1].id | https://openalex.org/C41008148 |
| concepts[1].level | 0 |
| concepts[1].score | 0.3766721785068512 |
| concepts[1].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[1].display_name | Computer science |
| concepts[2].id | https://openalex.org/C205649164 |
| concepts[2].level | 0 |
| concepts[2].score | 0.3706980347633362 |
| concepts[2].wikidata | https://www.wikidata.org/wiki/Q1071 |
| concepts[2].display_name | Geography |
| concepts[3].id | https://openalex.org/C154945302 |
| concepts[3].level | 1 |
| concepts[3].score | 0.205821692943573 |
| concepts[3].wikidata | https://www.wikidata.org/wiki/Q11660 |
| concepts[3].display_name | Artificial intelligence |
| keywords[0].id | https://openalex.org/keywords/knowledge-graph |
| keywords[0].score | 0.5747204422950745 |
| keywords[0].display_name | Knowledge graph |
| keywords[1].id | https://openalex.org/keywords/computer-science |
| keywords[1].score | 0.3766721785068512 |
| keywords[1].display_name | Computer science |
| keywords[2].id | https://openalex.org/keywords/geography |
| keywords[2].score | 0.3706980347633362 |
| keywords[2].display_name | Geography |
| keywords[3].id | https://openalex.org/keywords/artificial-intelligence |
| keywords[3].score | 0.205821692943573 |
| keywords[3].display_name | Artificial intelligence |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2408.15649 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2408.15649 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2408.15649 |
| locations[1].id | doi:10.48550/arxiv.2408.15649 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | cc-by |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | https://openalex.org/licenses/cc-by |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2408.15649 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5038564154 |
| authorships[0].author.orcid | https://orcid.org/0000-0001-7559-8658 |
| authorships[0].author.display_name | Marcin Pietrasik |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Pietrasik, Marcin |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5024453737 |
| authorships[1].author.orcid | https://orcid.org/0000-0003-4783-0717 |
| authorships[1].author.display_name | Marek Reformat |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Reformat, Marek |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5011946737 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-1989-0301 |
| authorships[2].author.display_name | Anna Wilbik |
| authorships[2].author_position | last |
| authorships[2].raw_author_name | Wilbik, Anna |
| authorships[2].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2408.15649 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Hierarchical Blockmodelling for Knowledge Graphs |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T10215 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.9891999959945679 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Semantic Web and Ontologies |
| related_works | https://openalex.org/W4391375266, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W4396701345, https://openalex.org/W2376932109, https://openalex.org/W2001405890, https://openalex.org/W4396696052, https://openalex.org/W2382290278, https://openalex.org/W4395014643 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2408.15649 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2408.15649 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2408.15649 |
| primary_location.id | pmh:oai:arXiv.org:2408.15649 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2408.15649 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2408.15649 |
| publication_date | 2024-08-28 |
| publication_year | 2024 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 34, 37, 57, 61, 106, 113, 197 |
| abstract_inverted_index.In | 0, 60, 101 |
| abstract_inverted_index.To | 121 |
| abstract_inverted_index.We | 139, 154, 200 |
| abstract_inverted_index.an | 134 |
| abstract_inverted_index.by | 83 |
| abstract_inverted_index.in | 28, 123, 129, 172, 179 |
| abstract_inverted_index.is | 81, 165 |
| abstract_inverted_index.of | 7, 17, 39, 44, 69, 86, 167, 190 |
| abstract_inverted_index.on | 21, 75, 143, 196, 210 |
| abstract_inverted_index.to | 55 |
| abstract_inverted_index.we | 3, 104, 125 |
| abstract_inverted_index.The | 42, 176 |
| abstract_inverted_index.Web | 31 |
| abstract_inverted_index.aid | 122 |
| abstract_inverted_index.and | 92, 111, 132, 145, 148, 160 |
| abstract_inverted_index.are | 47 |
| abstract_inverted_index.for | 14, 51, 66, 118, 136, 186, 193, 207 |
| abstract_inverted_index.its | 119 |
| abstract_inverted_index.our | 141, 157, 163 |
| abstract_inverted_index.set | 38 |
| abstract_inverted_index.the | 5, 15, 29, 67, 76, 84, 87, 93, 98, 127, 137, 183, 187, 202 |
| abstract_inverted_index.use | 6 |
| abstract_inverted_index.find | 161 |
| abstract_inverted_index.into | 36, 97 |
| abstract_inverted_index.more | 211 |
| abstract_inverted_index.step | 185 |
| abstract_inverted_index.such | 109 |
| abstract_inverted_index.that | 162 |
| abstract_inverted_index.then | 48 |
| abstract_inverted_index.this | 1, 64, 80, 102, 130, 180 |
| abstract_inverted_index.used | 27 |
| abstract_inverted_index.with | 204 |
| abstract_inverted_index.work | 177, 209 |
| abstract_inverted_index.Gibbs | 115 |
| abstract_inverted_index.Stick | 94 |
| abstract_inverted_index.These | 24 |
| abstract_inverted_index.first | 184 |
| abstract_inverted_index.graph | 35 |
| abstract_inverted_index.model | 107, 142, 164 |
| abstract_inverted_index.paper | 181, 203 |
| abstract_inverted_index.prior | 73 |
| abstract_inverted_index.scale | 174 |
| abstract_inverted_index.small | 173 |
| abstract_inverted_index.steps | 128 |
| abstract_inverted_index.their | 52 |
| abstract_inverted_index.these | 45 |
| abstract_inverted_index.Nested | 88 |
| abstract_inverted_index.allows | 65 |
| abstract_inverted_index.derive | 112 |
| abstract_inverted_index.entity | 19 |
| abstract_inverted_index.future | 208 |
| abstract_inverted_index.graph. | 59 |
| abstract_inverted_index.graphs | 195 |
| abstract_inverted_index.larger | 198 |
| abstract_inverted_index.model. | 100 |
| abstract_inverted_index.paper, | 2 |
| abstract_inverted_index.random | 58 |
| abstract_inverted_index.scale. | 199 |
| abstract_inverted_index.scheme | 117 |
| abstract_inverted_index.seldom | 26 |
| abstract_inverted_index.Chinese | 89 |
| abstract_inverted_index.Process | 91, 96 |
| abstract_inverted_index.against | 151 |
| abstract_inverted_index.avenues | 206 |
| abstract_inverted_index.capable | 166 |
| abstract_inverted_index.cluster | 170 |
| abstract_inverted_index.compare | 150 |
| abstract_inverted_index.further | 155, 188 |
| abstract_inverted_index.graphs. | 23 |
| abstract_inverted_index.models, | 10, 25 |
| abstract_inverted_index.models. | 153 |
| abstract_inverted_index.propose | 105 |
| abstract_inverted_index.provide | 133 |
| abstract_inverted_index.purpose | 16 |
| abstract_inverted_index.regard, | 103 |
| abstract_inverted_index.results | 158 |
| abstract_inverted_index.without | 72 |
| abstract_inverted_index.Breaking | 95 |
| abstract_inverted_index.Semantic | 30 |
| abstract_inverted_index.achieved | 82 |
| abstract_inverted_index.allowing | 50 |
| abstract_inverted_index.coherent | 169 |
| abstract_inverted_index.conclude | 201 |
| abstract_inverted_index.datasets | 147 |
| abstract_inverted_index.describe | 126 |
| abstract_inverted_index.evaluate | 140, 156 |
| abstract_inverted_index.generate | 56 |
| abstract_inverted_index.inducing | 168 |
| abstract_inverted_index.inferred | 49 |
| abstract_inverted_index.provides | 182 |
| abstract_inverted_index.sampler. | 138 |
| abstract_inverted_index.sampling | 54, 116 |
| abstract_inverted_index.scalable | 212 |
| abstract_inverted_index.schemes. | 214 |
| abstract_inverted_index.setting, | 63 |
| abstract_inverted_index.benchmark | 152 |
| abstract_inverted_index.collapsed | 114 |
| abstract_inverted_index.decompose | 33 |
| abstract_inverted_index.graphical | 9 |
| abstract_inverted_index.induction | 68 |
| abstract_inverted_index.inference | 213 |
| abstract_inverted_index.knowledge | 22, 194 |
| abstract_inverted_index.potential | 205 |
| abstract_inverted_index.presented | 178 |
| abstract_inverted_index.settings. | 175 |
| abstract_inverted_index.synthetic | 144 |
| abstract_inverted_index.Restaurant | 90 |
| abstract_inverted_index.clustering | 20 |
| abstract_inverted_index.community, | 32 |
| abstract_inverted_index.derivation | 131 |
| abstract_inverted_index.generative | 99 |
| abstract_inverted_index.inference. | 120 |
| abstract_inverted_index.leveraging | 108 |
| abstract_inverted_index.parameters | 43 |
| abstract_inverted_index.real-world | 146 |
| abstract_inverted_index.stochastic | 12, 191 |
| abstract_inverted_index.structure. | 78 |
| abstract_inverted_index.subsequent | 53 |
| abstract_inverted_index.application | 189 |
| abstract_inverted_index.blockmodels | 192 |
| abstract_inverted_index.clusterings | 71 |
| abstract_inverted_index.constraints | 74 |
| abstract_inverted_index.hierarchies | 171 |
| abstract_inverted_index.hierarchy's | 77 |
| abstract_inverted_index.integration | 85, 110 |
| abstract_inverted_index.investigate | 4 |
| abstract_inverted_index.probability | 40 |
| abstract_inverted_index.blockmodels, | 13 |
| abstract_inverted_index.hierarchical | 18, 70 |
| abstract_inverted_index.specifically | 11 |
| abstract_inverted_index.Specifically, | 79 |
| abstract_inverted_index.distributions | 46 |
| abstract_inverted_index.probabilistic | 8 |
| abstract_inverted_index.qualitatively | 159 |
| abstract_inverted_index.distributions. | 41 |
| abstract_inverted_index.implementation | 135 |
| abstract_inverted_index.non-parametric | 62 |
| abstract_inverted_index.quantitatively | 149 |
| abstract_inverted_index.understanding, | 124 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 3 |
| citation_normalized_percentile | |
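
The `abstract_inverted_index.*` rows above store the abstract as a map from each word to the positions where it occurs. A minimal sketch of turning such a map back into running text follows. The function name `rebuild_abstract` is hypothetical, and the sample dictionary is only an excerpt covering positions 0 to 7 of the table, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a word -> list-of-positions mapping."""
    word_at = {}
    for word, positions in inverted_index.items():
        for position in positions:
            word_at[position] = word
    # Join the words in position order; gaps in the index are skipped.
    return " ".join(word_at[position] for position in sorted(word_at))


# Excerpt of the table above, restricted to positions 0..7 (illustrative only).
sample_index = {
    "In": [0],
    "this": [1],
    "paper,": [2],
    "we": [3],
    "investigate": [4],
    "the": [5],
    "use": [6],
    "of": [7],
}

opening_words = rebuild_abstract(sample_index)
print(opening_words)  # In this paper, we investigate the use of
```

Applied to the complete index (positions 0 through 214), the same function would reproduce the abstract as OpenAlex stores it, including punctuation attached to the words.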