Striped Data Analysis Framework
2019 · Open Access · DOI: https://doi.org/10.1051/epjconf/202024506042/pdf
Traditionally, High Energy data analysis is based on a model in which data are stored in files and analyzed by running multiple analysis processes, each reading one or more of the data files. This process involves repeated data reduction steps that produce smaller files, which is time-consuming and leads to data duplication. We propose an alternative approach to data storage and analysis, based on Big Data technologies. The idea is to store each element of data once and only once in a distributed, scalable database and to analyze data by reading only the "interesting" pieces of data from the database. To make this approach possible, we developed the columnar Striped Data Representation Format as the basis of the framework. The traditional columnar approach allows for efficient analysis of complex data structures. While keeping all the benefits of columnar data representation, the striped mechanism goes further by enabling efficient parallelization of computations and flexible distribution of data analysis. The framework includes scalable and elastic data storage, compute, and user analysis backend components. It uses off-the-shelf web services and data caching technologies as the compute/data co-location mechanism. Its flexible architecture allows the framework to run in the cloud using container technologies. The framework offers Python/Jupyter as the user analysis backend platform, but can also run in command-line or batch mode. In the article, we present the results of the FNAL LDRD project FNAL-LDRD-2016-032: the design, implementation, features, and performance characteristics of the Striped Data Analysis Framework.
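The striped columnar idea in the abstract can be illustrated with a small sketch. This is not the framework's API: the names `make_stripes`, `build_store`, `count_selected` and `parallel_count`, the sample columns `pt`, `eta` and `charge`, and the use of a local thread pool in place of distributed workers are all assumptions made for illustration. It shows only the general pattern: each column is stored once, cut into fixed-size stripes, and a job reads just the column it needs and processes the stripes independently before combining partial results.

```python
from concurrent.futures import ThreadPoolExecutor


def make_stripes(column, stripe_size):
    """Split one column (a list of values) into fixed-size stripes."""
    return [column[i:i + stripe_size] for i in range(0, len(column), stripe_size)]


def build_store(events, stripe_size):
    """Convert row-wise records into {column_name: [stripe, stripe, ...]}."""
    store = {}
    for name in events[0]:
        values = [event[name] for event in events]
        store[name] = make_stripes(values, stripe_size)
    return store


def count_selected(stripe, threshold):
    """Per-stripe work: count values above the threshold."""
    return sum(1 for value in stripe if value > threshold)


def parallel_count(store, column, threshold, workers=4):
    """Read only the requested column and reduce over its stripes in parallel."""
    stripes = store[column]  # the other columns are never touched
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = pool.map(lambda s: count_selected(s, threshold), stripes)
        return sum(partial_counts)


# Hypothetical toy dataset: ten events with three attributes each.
events = [{"pt": float(i), "eta": i * 0.01, "charge": (-1) ** i} for i in range(10)]
store = build_store(events, stripe_size=4)
n_selected = parallel_count(store, "pt", threshold=6.5)
```

Because every stripe is processed independently, the same map-and-combine pattern could in principle be spread over many workers located near the data, which is the kind of parallelization the abstract attributes to the striped mechanism.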
Related Topics
- Type: paratext
- Language: en
- https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4288031063
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4288031063 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.1051/epjconf/202024506042/pdf (Digital Object Identifier)
- Title: Striped Data Analysis Framework (work title)
- Type: paratext (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2019 (year of publication)
- Publication date: 2019-11-25 (full publication date if available)
- Authors: O. Gutsche, I. V. Mandrichenko (list of authors in order)
- PDF URL: https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf (direct link to full text PDF)
- Open access: Yes (whether a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf (direct OA link when available)
- Concepts: Computer science (top concepts, i.e. fields/topics, attached by OpenAlex)
- Cited by: 0 (total citation count in OpenAlex)
- Related works (count): 10 (other works algorithmically related by OpenAlex)
Full payload
| id | https://openalex.org/W4288031063 |
|---|---|
| doi | https://doi.org/10.1051/epjconf/202024506042/pdf |
| ids.openalex | https://openalex.org/W4288031063 |
| fwci | |
| type | paratext |
| title | Striped Data Analysis Framework |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11512 |
| topics[0].field.id | https://openalex.org/fields/17 |
| topics[0].field.display_name | Computer Science |
| topics[0].score | 0.13740000128746033 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1702 |
| topics[0].subfield.display_name | Artificial Intelligence |
| topics[0].display_name | Anomaly Detection Techniques and Applications |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| concepts[0].id | https://openalex.org/C41008148 |
| concepts[0].level | 0 |
| concepts[0].score | 0.6529140472412109 |
| concepts[0].wikidata | https://www.wikidata.org/wiki/Q21198 |
| concepts[0].display_name | Computer science |
| keywords[0].id | https://openalex.org/keywords/computer-science |
| keywords[0].score | 0.6529140472412109 |
| keywords[0].display_name | Computer science |
| language | en |
| locations[0].id | pmh:oai:edpsciences.org:dkey/10.1051/epjconf/202024506042 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400744 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | False |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | Springer Link (Chiba Institute of Technology) |
| locations[0].source.host_organization | https://openalex.org/I8488066 |
| locations[0].source.host_organization_name | Chiba Institute of Technology |
| locations[0].source.host_organization_lineage | https://openalex.org/I8488066 |
| locations[0].license | cc-by |
| locations[0].pdf_url | https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf |
| locations[0].version | submittedVersion |
| locations[0].raw_type | Text |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | https://doi.org/10.1051/epjconf/202024506042 |
| locations[0].landing_page_url | |
| locations[1].id | pmh:oai:osti.gov:1574833 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306402487 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | False |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information) |
| locations[1].source.host_organization | https://openalex.org/I139351228 |
| locations[1].source.host_organization_name | Office of Scientific and Technical Information |
| locations[1].source.host_organization_lineage | https://openalex.org/I139351228 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | submittedVersion |
| locations[1].raw_type | |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | False |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://www.osti.gov/biblio/1574833 |
| authorships[0].author.id | https://openalex.org/A5101728612 |
| authorships[0].author.orcid | https://orcid.org/0000-0002-8015-9622 |
| authorships[0].author.display_name | O. Gutsche |
| authorships[0].countries | US |
| authorships[0].affiliations[0].institution_ids | https://openalex.org/I1314696892 |
| authorships[0].affiliations[0].raw_affiliation_string | Fermilab |
| authorships[0].institutions[0].id | https://openalex.org/I1314696892 |
| authorships[0].institutions[0].ror | https://ror.org/020hgte69 |
| authorships[0].institutions[0].type | facility |
| authorships[0].institutions[0].lineage | https://openalex.org/I1314696892, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210114836 |
| authorships[0].institutions[0].country_code | US |
| authorships[0].institutions[0].display_name | Fermi National Accelerator Laboratory |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Gutsche, Oliver |
| authorships[0].is_corresponding | False |
| authorships[0].raw_affiliation_strings | Fermilab |
| authorships[1].author.id | https://openalex.org/A5112854293 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | I. V. Mandrichenko |
| authorships[1].countries | US |
| authorships[1].affiliations[0].institution_ids | https://openalex.org/I1314696892 |
| authorships[1].affiliations[0].raw_affiliation_string | Fermilab |
| authorships[1].institutions[0].id | https://openalex.org/I1314696892 |
| authorships[1].institutions[0].ror | https://ror.org/020hgte69 |
| authorships[1].institutions[0].type | facility |
| authorships[1].institutions[0].lineage | https://openalex.org/I1314696892, https://openalex.org/I1330989302, https://openalex.org/I39565521, https://openalex.org/I4210114836 |
| authorships[1].institutions[0].country_code | US |
| authorships[1].institutions[0].display_name | Fermi National Accelerator Laboratory |
| authorships[1].author_position | last |
| authorships[1].raw_author_name | Mandrichenko, Igor |
| authorships[1].is_corresponding | False |
| authorships[1].raw_affiliation_strings | Fermilab |
| has_content.pdf | True |
| has_content.grobid_xml | True |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-10T00:00:00 |
| display_name | Striped Data Analysis Framework |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T04:12:42.849631 |
| primary_topic.id | https://openalex.org/T11512 |
| primary_topic.field.id | https://openalex.org/fields/17 |
| primary_topic.field.display_name | Computer Science |
| primary_topic.score | 0.13740000128746033 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1702 |
| primary_topic.subfield.display_name | Artificial Intelligence |
| primary_topic.display_name | Anomaly Detection Techniques and Applications |
| related_works | https://openalex.org/W2899084033, https://openalex.org/W2748952813, https://openalex.org/W2390279801, https://openalex.org/W2358668433, https://openalex.org/W2376932109, https://openalex.org/W2382290278, https://openalex.org/W2350741829, https://openalex.org/W2130043461, https://openalex.org/W2530322880, https://openalex.org/W1596801655 |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:edpsciences.org:dkey/10.1051/epjconf/202024506042 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400744 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | False |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | Springer Link (Chiba Institute of Technology) |
| best_oa_location.source.host_organization | https://openalex.org/I8488066 |
| best_oa_location.source.host_organization_name | Chiba Institute of Technology |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I8488066 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | Text |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | https://doi.org/10.1051/epjconf/202024506042 |
| best_oa_location.landing_page_url | |
| primary_location.id | pmh:oai:edpsciences.org:dkey/10.1051/epjconf/202024506042 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400744 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | False |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | Springer Link (Chiba Institute of Technology) |
| primary_location.source.host_organization | https://openalex.org/I8488066 |
| primary_location.source.host_organization_name | Chiba Institute of Technology |
| primary_location.source.host_organization_lineage | https://openalex.org/I8488066 |
| primary_location.license | cc-by |
| primary_location.pdf_url | https://www.epj-conferences.org/10.1051/epjconf/202024506042/pdf |
| primary_location.version | submittedVersion |
| primary_location.raw_type | Text |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | https://doi.org/10.1051/epjconf/202024506042 |
| primary_location.landing_page_url | |
| publication_date | 2019-11-25 |
| publication_year | 2019 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 82 |
| abstract_inverted_index.In | 216 |
| abstract_inverted_index.To | 99 |
| abstract_inverted_index.We | 52 |
| abstract_inverted_index.an | 54 |
| abstract_inverted_index.as | 111, 179, 200 |
| abstract_inverted_index.by | 18, 89, 141 |
| abstract_inverted_index.in | 14, 81, 190, 210 |
| abstract_inverted_index.is | 5, 44, 70 |
| abstract_inverted_index.of | 28, 75, 94, 114, 124, 133, 145, 151, 224, 237 |
| abstract_inverted_index.on | 7, 63 |
| abstract_inverted_index.or | 26, 213 |
| abstract_inverted_index.to | 49, 57, 71 |
| abstract_inverted_index.we | 104, 219 |
| abstract_inverted_index.Big | 65 |
| abstract_inverted_index.The | 68, 154, 168, 196 |
| abstract_inverted_index.all | 130 |
| abstract_inverted_index.and | 16, 47, 60, 78, 86, 147, 158, 163, 175, 234 |
| abstract_inverted_index.are | 12 |
| abstract_inverted_index.but | 206 |
| abstract_inverted_index.can | 207 |
| abstract_inverted_index.for | 121 |
| abstract_inverted_index.one | 25 |
| abstract_inverted_index.run | 189, 209 |
| abstract_inverted_index.the | 8, 29, 64, 97, 112, 115, 131, 180, 187, 191, 201, 217, 222, 225, 230, 238 |
| abstract_inverted_index.web | 173 |
| abstract_inverted_index.Data | 66, 108, 240 |
| abstract_inverted_index.FNAL | 227 |
| abstract_inverted_index.High | 1 |
| abstract_inverted_index.LDRD | 228 |
| abstract_inverted_index.This | 32 |
| abstract_inverted_index.also | 208 |
| abstract_inverted_index.data | 3, 11, 30, 36, 50, 58, 76, 88, 95, 126, 135, 152, 160, 176 |
| abstract_inverted_index.dist | 149 |
| abstract_inverted_index.each | 23, 73 |
| abstract_inverted_index.from | 96 |
| abstract_inverted_index.goes | 139 |
| abstract_inverted_index.idea | 69 |
| abstract_inverted_index.line | 212 |
| abstract_inverted_index.make | 100 |
| abstract_inverted_index.more | 27 |
| abstract_inverted_index.once | 77, 80 |
| abstract_inverted_index.only | 79, 91 |
| abstract_inverted_index.this | 101 |
| abstract_inverted_index.time | 45 |
| abstract_inverted_index.user | 164, 202 |
| abstract_inverted_index.uses | 170 |
| abstract_inverted_index.will | 220 |
| abstract_inverted_index.While | 128 |
| abstract_inverted_index.based | 6, 62 |
| abstract_inverted_index.basis | 113 |
| abstract_inverted_index.batch | 214 |
| abstract_inverted_index.cloud | 192 |
| abstract_inverted_index.files | 15 |
| abstract_inverted_index.leads | 48 |
| abstract_inverted_index.mode. | 215 |
| abstract_inverted_index.model | 9 |
| abstract_inverted_index.shelf | 172 |
| abstract_inverted_index.step, | 38 |
| abstract_inverted_index.store | 72 |
| abstract_inverted_index.using | 193 |
| abstract_inverted_index.where | 10 |
| abstract_inverted_index.which | 39, 43 |
| abstract_inverted_index.Energy | 2 |
| abstract_inverted_index.Format | 110 |
| abstract_inverted_index.allows | 120, 186 |
| abstract_inverted_index.files, | 42 |
| abstract_inverted_index.files. | 31 |
| abstract_inverted_index.offers | 198 |
| abstract_inverted_index.pieces | 93 |
| abstract_inverted_index.stored | 13 |
| abstract_inverted_index.Striped | 107, 239 |
| abstract_inverted_index.analyze | 87 |
| abstract_inverted_index.backend | 166, 204 |
| abstract_inverted_index.caching | 177 |
| abstract_inverted_index.command | 211 |
| abstract_inverted_index.complex | 125 |
| abstract_inverted_index.compute | 162 |
| abstract_inverted_index.design, | 231 |
| abstract_inverted_index.elastic | 159 |
| abstract_inverted_index.element | 74 |
| abstract_inverted_index.further | 140 |
| abstract_inverted_index.keeping | 129 |
| abstract_inverted_index.off-the | 171 |
| abstract_inverted_index.present | 221 |
| abstract_inverted_index.process | 33 |
| abstract_inverted_index.propose | 53 |
| abstract_inverted_index.reading | 24, 90 |
| abstract_inverted_index.results | 223 |
| abstract_inverted_index.running | 19 |
| abstract_inverted_index.smaller | 41 |
| abstract_inverted_index.storage | 59 |
| abstract_inverted_index.striped | 137 |
| abstract_inverted_index.Analysis | 241 |
| abstract_inverted_index.Flexible | 184 |
| abstract_inverted_index.analysis | 4, 21, 123, 165, 203 |
| abstract_inverted_index.analyzed | 17 |
| abstract_inverted_index.approach | 56, 102, 119 |
| abstract_inverted_index.article, | 218 |
| abstract_inverted_index.benefits | 132 |
| abstract_inverted_index.columnar | 106, 118, 134 |
| abstract_inverted_index.database | 85 |
| abstract_inverted_index.enabling | 142 |
| abstract_inverted_index.features | 233 |
| abstract_inverted_index.flexible | 148 |
| abstract_inverted_index.includes | 156 |
| abstract_inverted_index.involves | 34 |
| abstract_inverted_index.multiple | 20 |
| abstract_inverted_index.produces | 40 |
| abstract_inverted_index.project, | 229 |
| abstract_inverted_index.repeated | 35 |
| abstract_inverted_index.ribution | 150 |
| abstract_inverted_index.scalable | 84, 157 |
| abstract_inverted_index.services | 174 |
| abstract_inverted_index.storage, | 161 |
| abstract_inverted_index.analysis, | 61 |
| abstract_inverted_index.analysis. | 153 |
| abstract_inverted_index.consuming | 46 |
| abstract_inverted_index.container | 194 |
| abstract_inverted_index.database. | 98 |
| abstract_inverted_index.developed | 105 |
| abstract_inverted_index.efficient | 122, 143 |
| abstract_inverted_index.framework | 155, 169, 188, 197 |
| abstract_inverted_index.mechanism | 138 |
| abstract_inverted_index.platform, | 205 |
| abstract_inverted_index.possible, | 103 |
| abstract_inverted_index.reduction | 37 |
| abstract_inverted_index.Framework. | 242 |
| abstract_inverted_index.framework. | 116 |
| abstract_inverted_index.mechanism. | 183 |
| abstract_inverted_index.processes, | 22 |
| abstract_inverted_index.Traditional | 117 |
| abstract_inverted_index.alternative | 55 |
| abstract_inverted_index.co-location | 182 |
| abstract_inverted_index.components. | 167 |
| abstract_inverted_index.distributed | 83 |
| abstract_inverted_index.performance | 235 |
| abstract_inverted_index.structures. | 127 |
| abstract_inverted_index.architecture | 185 |
| abstract_inverted_index.computations | 146 |
| abstract_inverted_index.compute/data | 181 |
| abstract_inverted_index.duplication. | 51 |
| abstract_inverted_index.technologies | 178 |
| abstract_inverted_index."interesting" | 92 |
| abstract_inverted_index.technologies. | 67, 195 |
| abstract_inverted_index.Python/Jupyter | 199 |
| abstract_inverted_index.Representation | 109 |
| abstract_inverted_index.Traditionally, | 0 |
| abstract_inverted_index.characteristics | 236 |
| abstract_inverted_index.implementation, | 232 |
| abstract_inverted_index.parallelization | 144 |
| abstract_inverted_index.representation, | 136 |
| abstract_inverted_index.FNAL-LDRD-2016-032 | 226 |
| cited_by_percentile_year | |
| countries_distinct_count | 1 |
| institutions_distinct_count | 2 |
| citation_normalized_percentile | |
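The `abstract_inverted_index.*` rows in the payload map each token of the abstract to the zero-based word positions at which it occurs. A minimal sketch of how such an index can be turned back into running text follows; the function name `rebuild_abstract` is hypothetical, and `sample_index` is a hand-copied excerpt limited to positions 0 through 9 of the rows above, not the full index.

```python
def rebuild_abstract(inverted_index):
    """Rebuild text from a {token: [position, ...]} inverted index."""
    token_at = {}
    for token, positions in inverted_index.items():
        for position in positions:
            token_at[position] = token
    # Emit tokens in position order, separated by single spaces.
    return " ".join(token_at[i] for i in sorted(token_at))


# Excerpt of the payload rows, truncated to the first ten word positions.
sample_index = {
    "Traditionally,": [0],
    "High": [1],
    "Energy": [2],
    "data": [3],
    "analysis": [4],
    "is": [5],
    "based": [6],
    "on": [7],
    "the": [8],
    "model": [9],
}

opening = rebuild_abstract(sample_index)
```

Applied to the complete set of rows, the same function would reproduce the abstract as tokenized in the payload, including extraction artifacts such as the split tokens `dist` and `ribution`.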