Out-of-Sample Hydrocarbon Production Forecasting: Time Series Machine Learning using Productivity Index-Driven Features and Inductive Conformal Prediction
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2508.14078
This research introduces a machine learning (ML) framework designed to make out-of-sample hydrocarbon production forecasting more robust, with a focus on multivariate time series analysis. The proposed methodology integrates Productivity Index (PI)-driven feature selection, a concept derived from reservoir engineering, with Inductive Conformal Prediction (ICP) for rigorous uncertainty quantification. Using historical data from the Volve (wells PF14, PF12) and Norne (well E1H) oil fields, the study investigates the efficacy of four predictive algorithms, namely Long Short-Term Memory (LSTM), Bidirectional LSTM (BiLSTM), Gated Recurrent Unit (GRU), and eXtreme Gradient Boosting (XGBoost), in forecasting historical oil production rates (OPR_H). All models produced "out-of-sample" production forecasts for an upcoming future timeframe. Model performance was evaluated using traditional error metrics (e.g., MAE), supplemented by Forecast Bias and Prediction Direction Accuracy (PDA) to assess bias and trend-capturing capability. The PI-based feature selection reduced input dimensionality relative to conventional numerical simulation workflows. Uncertainty was quantified with the ICP framework, a distribution-free approach that guarantees valid prediction intervals (e.g., 95% coverage) without relying on distributional assumptions, which is a distinct advantage over traditional confidence intervals, particularly for complex, non-normal data. The LSTM model performed best, achieving the lowest MAE on both the test data (19.468) and the genuine out-of-sample forecast data (29.638) for well PF14, with subsequent validation on Norne well E1H. These findings highlight the significant potential of combining domain-specific knowledge with advanced ML techniques to improve the reliability of hydrocarbon production forecasts.
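The ICP step described in the abstract can be illustrated with a minimal sketch of split (inductive) conformal prediction for regression. This is not the authors' implementation: the function name `icp_interval`, the use of absolute residuals as the nonconformity score, and the synthetic heavy-tailed data are all assumptions made for illustration. Note also that the coverage guarantee is marginal and rests on the calibration and new points being exchangeable, which production time series may satisfy only approximately.

```python
import numpy as np


def icp_interval(cal_true, cal_pred, new_pred, alpha=0.05):
    """Split-conformal interval around new_pred (hypothetical helper).

    Uses absolute residuals on a held-out calibration set as the
    nonconformity score; this choice of score is an assumption.
    """
    cal_true = np.asarray(cal_true, dtype=float)
    cal_pred = np.asarray(cal_pred, dtype=float)
    new_pred = np.asarray(new_pred, dtype=float)

    scores = np.sort(np.abs(cal_true - cal_pred))
    n = scores.size
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = int(np.ceil((n + 1) * (1 - alpha)))
    # With too few calibration points the valid interval is unbounded.
    q = scores[k - 1] if k <= n else np.inf
    return new_pred - q, new_pred + q


# Synthetic demonstration with heavy-tailed (non-normal) errors.
rng = np.random.default_rng(0)
cal_pred = rng.uniform(100.0, 500.0, size=500)          # stand-in model outputs
cal_true = cal_pred + 20.0 * rng.standard_t(df=3, size=500)
test_pred = rng.uniform(100.0, 500.0, size=2000)
test_true = test_pred + 20.0 * rng.standard_t(df=3, size=2000)

lower, upper = icp_interval(cal_true, cal_pred, test_pred, alpha=0.05)
coverage = np.mean((test_true >= lower) & (test_true <= upper))
print(f"empirical coverage at nominal 95%: {coverage:.3f}")
```

Because the interval half-width is a single calibration quantile, every forecast receives the same width here; scores normalized by a local error estimate would give adaptive widths, at the cost of fitting a second model.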
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2508.14078, https://arxiv.org/pdf/2508.14078
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415238128
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415238128 (canonical identifier for this work in OpenAlex)
- DOI: https://doi.org/10.48550/arxiv.2508.14078 (Digital Object Identifier)
- Title: Out-of-Sample Hydrocarbon Production Forecasting: Time Series Machine Learning using Productivity Index-Driven Features and Inductive Conformal Prediction
- Type: preprint (OpenAlex work type)
- Language: en (primary language)
- Publication year: 2025
- Publication date: 2025-08-12
- Authors: Marislinda Idris, Jakub Marek Cebula, Jebraeel Gholinezhad, Shamsul Masum, Hongjie Ma (in order)
- Landing page: https://arxiv.org/abs/2508.14078 (publisher landing page)
- PDF URL: https://arxiv.org/pdf/2508.14078 (direct link to full-text PDF)
- Open access: Yes (a free full text is available)
- OA status: green (open access status per OpenAlex)
- OA URL: https://arxiv.org/pdf/2508.14078 (direct OA link)
- Cited by: 0 (total citation count in OpenAlex)
Full payload
| field | value |
|---|---|
| id | https://openalex.org/W4415238128 |
| doi | https://doi.org/10.48550/arxiv.2508.14078 |
| ids.doi | https://doi.org/10.48550/arxiv.2508.14078 |
| ids.openalex | https://openalex.org/W4415238128 |
| fwci | |
| type | preprint |
| title | Out-of-Sample Hydrocarbon Production Forecasting: Time Series Machine Learning using Productivity Index-Driven Features and Inductive Conformal Prediction |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T11801 |
| topics[0].field.id | https://openalex.org/fields/22 |
| topics[0].field.display_name | Engineering |
| topics[0].score | 0.9682999849319458 |
| topics[0].domain.id | https://openalex.org/domains/3 |
| topics[0].domain.display_name | Physical Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/2212 |
| topics[0].subfield.display_name | Ocean Engineering |
| topics[0].display_name | Reservoir Engineering and Simulation Methods |
| topics[1].id | https://openalex.org/T12157 |
| topics[1].field.id | https://openalex.org/fields/17 |
| topics[1].field.display_name | Computer Science |
| topics[1].score | 0.9635999798774719 |
| topics[1].domain.id | https://openalex.org/domains/3 |
| topics[1].domain.display_name | Physical Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/1702 |
| topics[1].subfield.display_name | Artificial Intelligence |
| topics[1].display_name | Geochemistry and Geologic Mapping |
| topics[2].id | https://openalex.org/T10399 |
| topics[2].field.id | https://openalex.org/fields/22 |
| topics[2].field.display_name | Engineering |
| topics[2].score | 0.9409000277519226 |
| topics[2].domain.id | https://openalex.org/domains/3 |
| topics[2].domain.display_name | Physical Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/2211 |
| topics[2].subfield.display_name | Mechanics of Materials |
| topics[2].display_name | Hydrocarbon exploration and reservoir analysis |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | pmh:oai:arXiv.org:2508.14078 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | |
| locations[0].pdf_url | https://arxiv.org/pdf/2508.14078 |
| locations[0].version | submittedVersion |
| locations[0].raw_type | text |
| locations[0].license_id | |
| locations[0].is_accepted | False |
| locations[0].is_published | False |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | http://arxiv.org/abs/2508.14078 |
| locations[1].id | doi:10.48550/arxiv.2508.14078 |
| locations[1].is_oa | True |
| locations[1].source.id | https://openalex.org/S4306400194 |
| locations[1].source.issn | |
| locations[1].source.type | repository |
| locations[1].source.is_oa | True |
| locations[1].source.issn_l | |
| locations[1].source.is_core | False |
| locations[1].source.is_in_doaj | False |
| locations[1].source.display_name | arXiv (Cornell University) |
| locations[1].source.host_organization | https://openalex.org/I205783295 |
| locations[1].source.host_organization_name | Cornell University |
| locations[1].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[1].license | |
| locations[1].pdf_url | |
| locations[1].version | |
| locations[1].raw_type | article |
| locations[1].license_id | |
| locations[1].is_accepted | False |
| locations[1].is_published | |
| locations[1].raw_source_name | |
| locations[1].landing_page_url | https://doi.org/10.48550/arxiv.2508.14078 |
| indexed_in | arxiv, datacite |
| authorships[0].author.id | https://openalex.org/A5051725231 |
| authorships[0].author.orcid | |
| authorships[0].author.display_name | Marislinda Idris |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Idris, Mohamed Hassan Abdalla |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5120021242 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Jakub Marek Cebula |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Cebula, Jakub Marek |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5020773163 |
| authorships[2].author.orcid | https://orcid.org/0000-0002-0495-9812 |
| authorships[2].author.display_name | Jebraeel Gholinezhad |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Gholinezhad, Jebraeel |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5040945356 |
| authorships[3].author.orcid | https://orcid.org/0000-0001-8489-9356 |
| authorships[3].author.display_name | Shamsul Masum |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Masum, Shamsul |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5014508923 |
| authorships[4].author.orcid | https://orcid.org/0000-0001-8507-3636 |
| authorships[4].author.display_name | Hongjie Ma |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Ma, Hongjie |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://arxiv.org/pdf/2508.14078 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-16T00:00:00 |
| display_name | Out-of-Sample Hydrocarbon Production Forecasting: Time Series Machine Learning using Productivity Index-Driven Features and Inductive Conformal Prediction |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T11801 |
| primary_topic.field.id | https://openalex.org/fields/22 |
| primary_topic.field.display_name | Engineering |
| primary_topic.score | 0.9682999849319458 |
| primary_topic.domain.id | https://openalex.org/domains/3 |
| primary_topic.domain.display_name | Physical Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/2212 |
| primary_topic.subfield.display_name | Ocean Engineering |
| primary_topic.display_name | Reservoir Engineering and Simulation Methods |
| cited_by_count | 0 |
| locations_count | 2 |
| best_oa_location.id | pmh:oai:arXiv.org:2508.14078 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | |
| best_oa_location.pdf_url | https://arxiv.org/pdf/2508.14078 |
| best_oa_location.version | submittedVersion |
| best_oa_location.raw_type | text |
| best_oa_location.license_id | |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | http://arxiv.org/abs/2508.14078 |
| primary_location.id | pmh:oai:arXiv.org:2508.14078 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | |
| primary_location.pdf_url | https://arxiv.org/pdf/2508.14078 |
| primary_location.version | submittedVersion |
| primary_location.raw_type | text |
| primary_location.license_id | |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | http://arxiv.org/abs/2508.14078 |
| publication_date | 2025-08-12 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.- | 87 |
| abstract_inverted_index.a | 3, 32, 156, 173 |
| abstract_inverted_index.ML | 5, 229 |
| abstract_inverted_index.an | 103 |
| abstract_inverted_index.by | 119 |
| abstract_inverted_index.in | 88 |
| abstract_inverted_index.of | 12, 67, 190, 223, 235 |
| abstract_inverted_index.on | 169, 198, 213 |
| abstract_inverted_index.to | 8, 127, 142, 231 |
| abstract_inverted_index.95% | 165 |
| abstract_inverted_index.All | 95 |
| abstract_inverted_index.ICP | 154 |
| abstract_inverted_index.MAE | 197 |
| abstract_inverted_index.The | 23, 133, 147 |
| abstract_inverted_index.and | 56, 82, 122, 130, 201 |
| abstract_inverted_index.for | 43, 102, 181, 207 |
| abstract_inverted_index.new | 4 |
| abstract_inverted_index.oil | 60, 91 |
| abstract_inverted_index.the | 10, 51, 65, 96, 153, 187, 191, 195, 220, 233 |
| abstract_inverted_index.was | 109, 150 |
| abstract_inverted_index.Bias | 121 |
| abstract_inverted_index.E1H) | 59 |
| abstract_inverted_index.E1H. | 216 |
| abstract_inverted_index.LSTM | 76, 192 |
| abstract_inverted_index.Long | 71 |
| abstract_inverted_index.MAE) | 117 |
| abstract_inverted_index.This | 0 |
| abstract_inverted_index.Unit | 80 |
| abstract_inverted_index.bias | 129 |
| abstract_inverted_index.data | 49, 205 |
| abstract_inverted_index.from | 35, 50 |
| abstract_inverted_index.over | 176 |
| abstract_inverted_index.test | 199 |
| abstract_inverted_index.that | 159 |
| abstract_inverted_index.this | 62 |
| abstract_inverted_index.time | 20 |
| abstract_inverted_index.well | 208, 215 |
| abstract_inverted_index.with | 38, 210, 227 |
| abstract_inverted_index.(ICP) | 42 |
| abstract_inverted_index.(PDA) | 126 |
| abstract_inverted_index.(well | 58 |
| abstract_inverted_index.Gated | 78 |
| abstract_inverted_index.Index | 28 |
| abstract_inverted_index.Model | 107 |
| abstract_inverted_index.Norne | 57, 214 |
| abstract_inverted_index.PF12) | 55 |
| abstract_inverted_index.PF14, | 54, 209 |
| abstract_inverted_index.These | 217 |
| abstract_inverted_index.Volve | 52 |
| abstract_inverted_index.data. | 184 |
| abstract_inverted_index.error | 114 |
| abstract_inverted_index.input | 139 |
| abstract_inverted_index.rates | 93 |
| abstract_inverted_index.study | 63 |
| abstract_inverted_index.using | 112, 152 |
| abstract_inverted_index.valid | 161 |
| abstract_inverted_index.(GRU), | 81 |
| abstract_inverted_index.(e.g., | 116, 164 |
| abstract_inverted_index.(wells | 53 |
| abstract_inverted_index.Memory | 73 |
| abstract_inverted_index.assess | 128 |
| abstract_inverted_index.future | 105 |
| abstract_inverted_index.lowest | 196 |
| abstract_inverted_index.model, | 193 |
| abstract_inverted_index.models | 97 |
| abstract_inverted_index.series | 21 |
| abstract_inverted_index.(LSTM), | 74 |
| abstract_inverted_index.Results | 185 |
| abstract_inverted_index.concept | 33 |
| abstract_inverted_index.derived | 34 |
| abstract_inverted_index.eXtreme | 83 |
| abstract_inverted_index.enhance | 9 |
| abstract_inverted_index.feature | 30, 135 |
| abstract_inverted_index.fields, | 61 |
| abstract_inverted_index.genuine | 202 |
| abstract_inverted_index.improve | 232 |
| abstract_inverted_index.metrics | 115 |
| abstract_inverted_index.reduced | 138 |
| abstract_inverted_index.various | 68 |
| abstract_inverted_index.without | 167 |
| abstract_inverted_index.(19.468) | 200 |
| abstract_inverted_index.(29.638) | 206 |
| abstract_inverted_index.(OPR_H). | 94 |
| abstract_inverted_index.Accuracy | 125 |
| abstract_inverted_index.Boosting | 85 |
| abstract_inverted_index.Forecast | 120 |
| abstract_inverted_index.Gradient | 84 |
| abstract_inverted_index.PI-based | 134 |
| abstract_inverted_index.achieved | 98 |
| abstract_inverted_index.advanced | 228 |
| abstract_inverted_index.approach | 158 |
| abstract_inverted_index.compared | 141 |
| abstract_inverted_index.complex, | 182 |
| abstract_inverted_index.designed | 7 |
| abstract_inverted_index.distinct | 174 |
| abstract_inverted_index.efficacy | 66 |
| abstract_inverted_index.findings | 218 |
| abstract_inverted_index.forecast | 204 |
| abstract_inverted_index.offering | 172 |
| abstract_inverted_index.proposed | 24 |
| abstract_inverted_index.reliance | 168 |
| abstract_inverted_index.research | 1 |
| abstract_inverted_index.rigorous | 44 |
| abstract_inverted_index.superior | 188 |
| abstract_inverted_index.upcoming | 104 |
| abstract_inverted_index.(BiLSTM), | 77 |
| abstract_inverted_index.(XGBoost) | 86 |
| abstract_inverted_index.Conformal | 40 |
| abstract_inverted_index.Direction | 124 |
| abstract_inverted_index.Inductive | 39 |
| abstract_inverted_index.Recurrent | 79 |
| abstract_inverted_index.Utilizing | 47 |
| abstract_inverted_index.achieving | 194 |
| abstract_inverted_index.addressed | 151 |
| abstract_inverted_index.advantage | 175 |
| abstract_inverted_index.analysis. | 22 |
| abstract_inverted_index.combining | 224 |
| abstract_inverted_index.coverage) | 166 |
| abstract_inverted_index.evaluated | 111 |
| abstract_inverted_index.forecasts | 101 |
| abstract_inverted_index.framework | 6 |
| abstract_inverted_index.highlight | 219 |
| abstract_inverted_index.intervals | 163 |
| abstract_inverted_index.knowledge | 226 |
| abstract_inverted_index.numerical | 144 |
| abstract_inverted_index.potential | 222 |
| abstract_inverted_index.reservoir | 36 |
| abstract_inverted_index.selection | 136 |
| abstract_inverted_index.Prediction | 41, 123 |
| abstract_inverted_index.Short-Term | 72 |
| abstract_inverted_index.addressing | 18 |
| abstract_inverted_index.confidence | 178 |
| abstract_inverted_index.forecasts. | 238 |
| abstract_inverted_index.framework, | 155 |
| abstract_inverted_index.guarantees | 160 |
| abstract_inverted_index.historical | 48, 90 |
| abstract_inverted_index.integrates | 26 |
| abstract_inverted_index.intervals, | 179 |
| abstract_inverted_index.introduces | 2 |
| abstract_inverted_index.non-normal | 183 |
| abstract_inverted_index.prediction | 162 |
| abstract_inverted_index.predictive | 69 |
| abstract_inverted_index.production | 15, 92, 100, 237 |
| abstract_inverted_index.robustness | 11 |
| abstract_inverted_index.selection, | 31 |
| abstract_inverted_index.simulation | 145 |
| abstract_inverted_index.subsequent | 211 |
| abstract_inverted_index.techniques | 230 |
| abstract_inverted_index.timeframe. | 106 |
| abstract_inverted_index.validation | 212 |
| abstract_inverted_index.workflows. | 146 |
| abstract_inverted_index.(PI)-driven | 29 |
| abstract_inverted_index.effectively | 137 |
| abstract_inverted_index.forecasting | 89 |
| abstract_inverted_index.hydrocarbon | 14, 236 |
| abstract_inverted_index.methodology | 25 |
| abstract_inverted_index.performance | 108, 189 |
| abstract_inverted_index.reliability | 234 |
| abstract_inverted_index.significant | 221 |
| abstract_inverted_index.traditional | 113, 177 |
| abstract_inverted_index.uncertainty | 45, 148 |
| abstract_inverted_index.Productivity | 27 |
| abstract_inverted_index.assumptions, | 171 |
| abstract_inverted_index.conventional | 143 |
| abstract_inverted_index.demonstrated | 186 |
| abstract_inverted_index.engineering, | 37 |
| abstract_inverted_index.forecasting, | 16 |
| abstract_inverted_index.investigates | 64 |
| abstract_inverted_index.multivariate | 19 |
| abstract_inverted_index.particularly | 180 |
| abstract_inverted_index.specifically | 17 |
| abstract_inverted_index.supplemented | 118 |
| abstract_inverted_index.Bidirectional | 75 |
| abstract_inverted_index.capabilities. | 132 |
| abstract_inverted_index.out-of-sample | 13, 203 |
| abstract_inverted_index.dimensionality | 140 |
| abstract_inverted_index.distributional | 170 |
| abstract_inverted_index.quantification | 149 |
| abstract_inverted_index."out-of-sample" | 99 |
| abstract_inverted_index.comprehensively | 110 |
| abstract_inverted_index.domain-specific | 225 |
| abstract_inverted_index.quantification. | 46 |
| abstract_inverted_index.trend-capturing | 131 |
| abstract_inverted_index.algorithms-namely | 70 |
| abstract_inverted_index.distribution-free | 157 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile |