iBitter-Stack: A Multi-Representation Ensemble Learning Model for Accurate Bitter Peptide Identification
2025 · Open Access · DOI: https://doi.org/10.48550/arxiv.2505.15730
The identification of bitter peptides is crucial in various domains, including food science, drug discovery, and biochemical research. These peptides not only contribute to the undesirable taste of hydrolyzed proteins but also play key roles in physiological and pharmacological processes. However, experimental methods for identifying bitter peptides are time-consuming and expensive. With the rapid expansion of peptide sequence databases in the post-genomic era, the demand for efficient computational approaches to distinguish bitter from non-bitter peptides has become increasingly significant. In this study, we propose a novel stacking-based ensemble learning framework aimed at enhancing the accuracy and reliability of bitter peptide classification. Our method integrates diverse sequence-based feature representations and leverages a broad set of machine learning classifiers. The first stacking layer comprises multiple base classifiers, each trained on distinct feature encoding schemes, while the second layer employs logistic regression to refine predictions using an eight-dimensional probability vector. Extensive evaluations on a carefully curated dataset demonstrate that our model significantly outperforms existing predictive methods, providing a robust and reliable computational tool for bitter peptide identification. Our approach achieves an accuracy of 96.09% and a Matthews Correlation Coefficient (MCC) of 0.9220 on the independent test set, underscoring its effectiveness and generalizability. To facilitate real-time usage and broader accessibility, we have also developed a user-friendly web server based on the proposed method, which is freely accessible at https://ibitter-stack-webserver.streamlit.app/. This tool enables researchers and practitioners to conveniently screen peptide sequences for bitterness in real-time applications.
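The second stacking layer and the two reported metrics can be illustrated with a minimal Python sketch. Everything numeric here is an assumption made for illustration: the weights, the bias, the eight base-classifier probabilities, and the confusion-matrix counts are hypothetical and are not taken from the paper. In the described model the logistic regression parameters would be learned from the base classifiers' outputs rather than set by hand.

```python
import math


def meta_predict(base_probs, weights, bias, threshold=0.5):
    """Combine base-classifier probabilities with a logistic regression layer.

    base_probs: one predicted "bitter" probability per base classifier
                (eight values in the setup the abstract describes).
    weights, bias: logistic regression parameters (hypothetical here).
    Returns (probability, is_bitter).
    """
    if len(base_probs) != len(weights):
        raise ValueError("need exactly one weight per base-classifier probability")
    z = bias + sum(w * p for w, p in zip(weights, base_probs))
    prob = 1.0 / (1.0 + math.exp(-z))  # sigmoid
    return prob, prob >= threshold


def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


# Hypothetical parameters and inputs, for illustration only.
WEIGHTS = [1.0] * 8
BIAS = -4.0
probs = [0.91, 0.85, 0.78, 0.95, 0.66, 0.88, 0.73, 0.90]

prob, is_bitter = meta_predict(probs, WEIGHTS, BIAS)
print(f"meta-layer probability: {prob:.3f}, predicted bitter: {is_bitter}")

# Hypothetical confusion-matrix counts (not reported in the abstract).
print(f"accuracy: {accuracy(62, 61, 3, 2):.4f}")
print(f"MCC: {mcc(62, 61, 3, 2):.4f}")
```

MCC uses all four confusion-matrix cells, which is why it is commonly reported alongside accuracy for binary classifiers.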
- Type: preprint
- Language: en
- Landing Page: https://doi.org/10.48550/arxiv.2505.15730
- OA Status: green
- OpenAlex ID: https://openalex.org/W4415329206
Raw OpenAlex JSON
- OpenAlex ID: https://openalex.org/W4415329206
- DOI: https://doi.org/10.48550/arxiv.2505.15730
- Title: iBitter-Stack: A Multi-Representation Ensemble Learning Model for Accurate Bitter Peptide Identification
- Type: preprint
- Language: en
- Publication year: 2025
- Publication date: 2025-05-21
- Authors: Sarfraz Ahmad, Momina Ahsan, Muhammad Nabeel Asim, Andreas Dengel, Muhammad Imran Malik
- Landing page: https://doi.org/10.48550/arxiv.2505.15730
- Open access: Yes
- OA status: green
- OA URL: https://doi.org/10.48550/arxiv.2505.15730
- Cited by: 0
Full payload
| id | https://openalex.org/W4415329206 |
|---|---|
| doi | https://doi.org/10.48550/arxiv.2505.15730 |
| ids.doi | https://doi.org/10.48550/arxiv.2505.15730 |
| ids.openalex | https://openalex.org/W4415329206 |
| fwci | |
| type | preprint |
| title | iBitter-Stack: A Multi-Representation Ensemble Learning Model for Accurate Bitter Peptide Identification |
| biblio.issue | |
| biblio.volume | |
| biblio.last_page | |
| biblio.first_page | |
| topics[0].id | https://openalex.org/T12254 |
| topics[0].field.id | https://openalex.org/fields/13 |
| topics[0].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[0].score | 0.9588000178337097 |
| topics[0].domain.id | https://openalex.org/domains/1 |
| topics[0].domain.display_name | Life Sciences |
| topics[0].subfield.id | https://openalex.org/subfields/1312 |
| topics[0].subfield.display_name | Molecular Biology |
| topics[0].display_name | Machine Learning in Bioinformatics |
| topics[1].id | https://openalex.org/T11584 |
| topics[1].field.id | https://openalex.org/fields/29 |
| topics[1].field.display_name | Nursing |
| topics[1].score | 0.9408000111579895 |
| topics[1].domain.id | https://openalex.org/domains/4 |
| topics[1].domain.display_name | Health Sciences |
| topics[1].subfield.id | https://openalex.org/subfields/2916 |
| topics[1].subfield.display_name | Nutrition and Dietetics |
| topics[1].display_name | Biochemical Analysis and Sensing Techniques |
| topics[2].id | https://openalex.org/T13326 |
| topics[2].field.id | https://openalex.org/fields/13 |
| topics[2].field.display_name | Biochemistry, Genetics and Molecular Biology |
| topics[2].score | 0.9014999866485596 |
| topics[2].domain.id | https://openalex.org/domains/1 |
| topics[2].domain.display_name | Life Sciences |
| topics[2].subfield.id | https://openalex.org/subfields/1312 |
| topics[2].subfield.display_name | Molecular Biology |
| topics[2].display_name | Biochemical and Structural Characterization |
| is_xpac | False |
| apc_list | |
| apc_paid | |
| language | en |
| locations[0].id | doi:10.48550/arxiv.2505.15730 |
| locations[0].is_oa | True |
| locations[0].source.id | https://openalex.org/S4306400194 |
| locations[0].source.issn | |
| locations[0].source.type | repository |
| locations[0].source.is_oa | True |
| locations[0].source.issn_l | |
| locations[0].source.is_core | False |
| locations[0].source.is_in_doaj | False |
| locations[0].source.display_name | arXiv (Cornell University) |
| locations[0].source.host_organization | https://openalex.org/I205783295 |
| locations[0].source.host_organization_name | Cornell University |
| locations[0].source.host_organization_lineage | https://openalex.org/I205783295 |
| locations[0].license | cc-by |
| locations[0].pdf_url | |
| locations[0].version | |
| locations[0].raw_type | article |
| locations[0].license_id | https://openalex.org/licenses/cc-by |
| locations[0].is_accepted | False |
| locations[0].is_published | |
| locations[0].raw_source_name | |
| locations[0].landing_page_url | https://doi.org/10.48550/arxiv.2505.15730 |
| indexed_in | datacite |
| authorships[0].author.id | https://openalex.org/A5026976984 |
| authorships[0].author.orcid | https://orcid.org/0000-0003-0059-477X |
| authorships[0].author.display_name | Sarfraz Ahmad |
| authorships[0].author_position | first |
| authorships[0].raw_author_name | Ahmad, Sarfraz |
| authorships[0].is_corresponding | False |
| authorships[1].author.id | https://openalex.org/A5104157672 |
| authorships[1].author.orcid | |
| authorships[1].author.display_name | Momina Ahsan |
| authorships[1].author_position | middle |
| authorships[1].raw_author_name | Ahsan, Momina |
| authorships[1].is_corresponding | False |
| authorships[2].author.id | https://openalex.org/A5068324429 |
| authorships[2].author.orcid | https://orcid.org/0000-0001-5507-198X |
| authorships[2].author.display_name | Muhammad Nabeel Asim |
| authorships[2].author_position | middle |
| authorships[2].raw_author_name | Asim, Muhammad Nabeel |
| authorships[2].is_corresponding | False |
| authorships[3].author.id | https://openalex.org/A5101904182 |
| authorships[3].author.orcid | https://orcid.org/0000-0002-6100-8255 |
| authorships[3].author.display_name | Andreas Dengel |
| authorships[3].author_position | middle |
| authorships[3].raw_author_name | Dengel, Andreas |
| authorships[3].is_corresponding | False |
| authorships[4].author.id | https://openalex.org/A5103159827 |
| authorships[4].author.orcid | https://orcid.org/0000-0002-8079-5119 |
| authorships[4].author.display_name | Muhammad Imran Malik |
| authorships[4].author_position | last |
| authorships[4].raw_author_name | Malik, Muhammad Imran |
| authorships[4].is_corresponding | False |
| has_content.pdf | False |
| has_content.grobid_xml | False |
| is_paratext | False |
| open_access.is_oa | True |
| open_access.oa_url | https://doi.org/10.48550/arxiv.2505.15730 |
| open_access.oa_status | green |
| open_access.any_repository_has_fulltext | False |
| created_date | 2025-10-19T00:00:00 |
| display_name | iBitter-Stack: A Multi-Representation Ensemble Learning Model for Accurate Bitter Peptide Identification |
| has_fulltext | False |
| is_retracted | False |
| updated_date | 2025-11-06T06:51:31.235846 |
| primary_topic.id | https://openalex.org/T12254 |
| primary_topic.field.id | https://openalex.org/fields/13 |
| primary_topic.field.display_name | Biochemistry, Genetics and Molecular Biology |
| primary_topic.score | 0.9588000178337097 |
| primary_topic.domain.id | https://openalex.org/domains/1 |
| primary_topic.domain.display_name | Life Sciences |
| primary_topic.subfield.id | https://openalex.org/subfields/1312 |
| primary_topic.subfield.display_name | Molecular Biology |
| primary_topic.display_name | Machine Learning in Bioinformatics |
| cited_by_count | 0 |
| locations_count | 1 |
| best_oa_location.id | doi:10.48550/arxiv.2505.15730 |
| best_oa_location.is_oa | True |
| best_oa_location.source.id | https://openalex.org/S4306400194 |
| best_oa_location.source.issn | |
| best_oa_location.source.type | repository |
| best_oa_location.source.is_oa | True |
| best_oa_location.source.issn_l | |
| best_oa_location.source.is_core | False |
| best_oa_location.source.is_in_doaj | False |
| best_oa_location.source.display_name | arXiv (Cornell University) |
| best_oa_location.source.host_organization | https://openalex.org/I205783295 |
| best_oa_location.source.host_organization_name | Cornell University |
| best_oa_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| best_oa_location.license | cc-by |
| best_oa_location.pdf_url | |
| best_oa_location.version | |
| best_oa_location.raw_type | article |
| best_oa_location.license_id | https://openalex.org/licenses/cc-by |
| best_oa_location.is_accepted | False |
| best_oa_location.is_published | False |
| best_oa_location.raw_source_name | |
| best_oa_location.landing_page_url | https://doi.org/10.48550/arxiv.2505.15730 |
| primary_location.id | doi:10.48550/arxiv.2505.15730 |
| primary_location.is_oa | True |
| primary_location.source.id | https://openalex.org/S4306400194 |
| primary_location.source.issn | |
| primary_location.source.type | repository |
| primary_location.source.is_oa | True |
| primary_location.source.issn_l | |
| primary_location.source.is_core | False |
| primary_location.source.is_in_doaj | False |
| primary_location.source.display_name | arXiv (Cornell University) |
| primary_location.source.host_organization | https://openalex.org/I205783295 |
| primary_location.source.host_organization_name | Cornell University |
| primary_location.source.host_organization_lineage | https://openalex.org/I205783295 |
| primary_location.license | cc-by |
| primary_location.pdf_url | |
| primary_location.version | |
| primary_location.raw_type | article |
| primary_location.license_id | https://openalex.org/licenses/cc-by |
| primary_location.is_accepted | False |
| primary_location.is_published | False |
| primary_location.raw_source_name | |
| primary_location.landing_page_url | https://doi.org/10.48550/arxiv.2505.15730 |
| publication_date | 2025-05-21 |
| publication_year | 2025 |
| referenced_works_count | 0 |
| abstract_inverted_index.a | 84, 110, 150, 164, 182, 210 |
| abstract_inverted_index.In | 79 |
| abstract_inverted_index.To | 199 |
| abstract_inverted_index.an | 143, 177 |
| abstract_inverted_index.at | 91, 223 |
| abstract_inverted_index.in | 7, 35, 59, 238 |
| abstract_inverted_index.is | 5, 220 |
| abstract_inverted_index.of | 2, 27, 55, 97, 113, 179, 187 |
| abstract_inverted_index.on | 127, 149, 189, 215 |
| abstract_inverted_index.to | 23, 69, 139, 231 |
| abstract_inverted_index.we | 82, 206 |
| abstract_inverted_index.Our | 101, 174 |
| abstract_inverted_index.The | 0, 117 |
| abstract_inverted_index.and | 15, 37, 49, 95, 108, 166, 181, 197, 203, 229 |
| abstract_inverted_index.are | 47 |
| abstract_inverted_index.but | 30 |
| abstract_inverted_index.for | 43, 65, 170, 236 |
| abstract_inverted_index.has | 75 |
| abstract_inverted_index.its | 195 |
| abstract_inverted_index.key | 33 |
| abstract_inverted_index.not | 20 |
| abstract_inverted_index.our | 156 |
| abstract_inverted_index.set | 112 |
| abstract_inverted_index.the | 24, 52, 60, 63, 93, 133, 190, 216 |
| abstract_inverted_index.web | 212 |
| abstract_inverted_index.This | 225 |
| abstract_inverted_index.With | 51 |
| abstract_inverted_index.also | 31, 208 |
| abstract_inverted_index.base | 123 |
| abstract_inverted_index.drug | 13 |
| abstract_inverted_index.each | 125 |
| abstract_inverted_index.era, | 62 |
| abstract_inverted_index.food | 11 |
| abstract_inverted_index.from | 72 |
| abstract_inverted_index.have | 207 |
| abstract_inverted_index.only | 21 |
| abstract_inverted_index.play | 32 |
| abstract_inverted_index.set, | 193 |
| abstract_inverted_index.test | 192 |
| abstract_inverted_index.that | 155 |
| abstract_inverted_index.this | 80 |
| abstract_inverted_index.tool | 169, 226 |
| abstract_inverted_index.(MCC) | 186 |
| abstract_inverted_index.These | 18 |
| abstract_inverted_index.aimed | 90 |
| abstract_inverted_index.based | 214 |
| abstract_inverted_index.broad | 111 |
| abstract_inverted_index.first | 118 |
| abstract_inverted_index.layer | 120, 135 |
| abstract_inverted_index.model | 157 |
| abstract_inverted_index.novel | 85 |
| abstract_inverted_index.rapid | 53 |
| abstract_inverted_index.roles | 34 |
| abstract_inverted_index.taste | 26 |
| abstract_inverted_index.usage | 202 |
| abstract_inverted_index.using | 142 |
| abstract_inverted_index.which | 219 |
| abstract_inverted_index.while | 132 |
| abstract_inverted_index.0.9220 | 188 |
| abstract_inverted_index.become | 76 |
| abstract_inverted_index.bitter | 3, 45, 71, 98, 171 |
| abstract_inverted_index.demand | 64 |
| abstract_inverted_index.freely | 221 |
| abstract_inverted_index.method | 102 |
| abstract_inverted_index.refine | 140 |
| abstract_inverted_index.robust | 165 |
| abstract_inverted_index.screen | 233 |
| abstract_inverted_index.second | 134 |
| abstract_inverted_index.server | 213 |
| abstract_inverted_index.study, | 81 |
| abstract_inverted_index.96.09\% | 180 |
| abstract_inverted_index.broader | 204 |
| abstract_inverted_index.crucial | 6 |
| abstract_inverted_index.curated | 152 |
| abstract_inverted_index.dataset | 153 |
| abstract_inverted_index.diverse | 104 |
| abstract_inverted_index.employs | 136 |
| abstract_inverted_index.enables | 227 |
| abstract_inverted_index.feature | 106, 129 |
| abstract_inverted_index.machine | 114 |
| abstract_inverted_index.method, | 218 |
| abstract_inverted_index.methods | 42 |
| abstract_inverted_index.peptide | 56, 99, 172, 234 |
| abstract_inverted_index.propose | 83 |
| abstract_inverted_index.trained | 126 |
| abstract_inverted_index.various | 8 |
| abstract_inverted_index.vector. | 146 |
| abstract_inverted_index.However, | 40 |
| abstract_inverted_index.Matthews | 183 |
| abstract_inverted_index.accuracy | 94, 178 |
| abstract_inverted_index.achieves | 176 |
| abstract_inverted_index.approach | 175 |
| abstract_inverted_index.distinct | 128 |
| abstract_inverted_index.domains, | 9 |
| abstract_inverted_index.encoding | 130 |
| abstract_inverted_index.ensemble | 87 |
| abstract_inverted_index.existing | 160 |
| abstract_inverted_index.learning | 88, 115 |
| abstract_inverted_index.logistic | 137 |
| abstract_inverted_index.methods, | 162 |
| abstract_inverted_index.multiple | 122 |
| abstract_inverted_index.peptides | 4, 19, 46, 74 |
| abstract_inverted_index.proposed | 217 |
| abstract_inverted_index.proteins | 29 |
| abstract_inverted_index.reliable | 167 |
| abstract_inverted_index.schemes, | 131 |
| abstract_inverted_index.science, | 12 |
| abstract_inverted_index.sequence | 57 |
| abstract_inverted_index.stacking | 119 |
| abstract_inverted_index.Extensive | 147 |
| abstract_inverted_index.carefully | 151 |
| abstract_inverted_index.comprises | 121 |
| abstract_inverted_index.databases | 58 |
| abstract_inverted_index.developed | 209 |
| abstract_inverted_index.efficient | 66 |
| abstract_inverted_index.enhancing | 92 |
| abstract_inverted_index.expansion | 54 |
| abstract_inverted_index.framework | 89 |
| abstract_inverted_index.including | 10 |
| abstract_inverted_index.leverages | 109 |
| abstract_inverted_index.providing | 163 |
| abstract_inverted_index.real-time | 201, 239 |
| abstract_inverted_index.research. | 17 |
| abstract_inverted_index.sequences | 235 |
| abstract_inverted_index.accessible | 222 |
| abstract_inverted_index.approaches | 68 |
| abstract_inverted_index.bitterness | 237 |
| abstract_inverted_index.contribute | 22 |
| abstract_inverted_index.discovery, | 14 |
| abstract_inverted_index.expensive. | 50 |
| abstract_inverted_index.facilitate | 200 |
| abstract_inverted_index.hydrolyzed | 28 |
| abstract_inverted_index.integrates | 103 |
| abstract_inverted_index.non-bitter | 73 |
| abstract_inverted_index.predictive | 161 |
| abstract_inverted_index.processes. | 39 |
| abstract_inverted_index.regression | 138 |
| abstract_inverted_index.Coefficient | 185 |
| abstract_inverted_index.Correlation | 184 |
| abstract_inverted_index.biochemical | 16 |
| abstract_inverted_index.demonstrate | 154 |
| abstract_inverted_index.distinguish | 70 |
| abstract_inverted_index.evaluations | 148 |
| abstract_inverted_index.identifying | 44 |
| abstract_inverted_index.independent | 191 |
| abstract_inverted_index.outperforms | 159 |
| abstract_inverted_index.predictions | 141 |
| abstract_inverted_index.probability | 145 |
| abstract_inverted_index.reliability | 96 |
| abstract_inverted_index.researchers | 228 |
| abstract_inverted_index.undesirable | 25 |
| abstract_inverted_index.classifiers, | 124 |
| abstract_inverted_index.classifiers. | 116 |
| abstract_inverted_index.conveniently | 232 |
| abstract_inverted_index.experimental | 41 |
| abstract_inverted_index.increasingly | 77 |
| abstract_inverted_index.post-genomic | 61 |
| abstract_inverted_index.significant. | 78 |
| abstract_inverted_index.underscoring | 194 |
| abstract_inverted_index.applications. | 240 |
| abstract_inverted_index.computational | 67, 168 |
| abstract_inverted_index.effectiveness | 196 |
| abstract_inverted_index.physiological | 36 |
| abstract_inverted_index.practitioners | 230 |
| abstract_inverted_index.significantly | 158 |
| abstract_inverted_index.user-friendly | 211 |
| abstract_inverted_index.accessibility, | 205 |
| abstract_inverted_index.identification | 1 |
| abstract_inverted_index.sequence-based | 105 |
| abstract_inverted_index.stacking-based | 86 |
| abstract_inverted_index.time-consuming | 48 |
| abstract_inverted_index.classification. | 100 |
| abstract_inverted_index.identification. | 173 |
| abstract_inverted_index.pharmacological | 38 |
| abstract_inverted_index.representations | 107 |
| abstract_inverted_index.eight-dimensional | 144 |
| abstract_inverted_index.generalizability. | 198 |
| abstract_inverted_index.https://ibitter-stack-webserver.streamlit.app/. | 224 |
| cited_by_percentile_year | |
| countries_distinct_count | 0 |
| institutions_distinct_count | 5 |
| citation_normalized_percentile | |
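The `abstract_inverted_index.*` rows in the payload map each word to the positions at which it occurs in the abstract. A minimal sketch of turning such rows back into running text follows. The helper names `parse_rows` and `rebuild_abstract` are hypothetical, and the parser assumes the flattened two-column `| key | value |` row layout shown in the table above; it is not an OpenAlex client.

```python
PREFIX = "abstract_inverted_index."


def parse_rows(lines):
    """Collect word -> positions from flattened '| key | value |' table rows."""
    index = {}
    for line in lines:
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 2 or not cells[0].startswith(PREFIX):
            continue  # not an inverted-index row
        word = cells[0][len(PREFIX):]
        index[word] = [int(p) for p in cells[1].split(",") if p.strip()]
    return index


def rebuild_abstract(inverted_index):
    """Place each word at its positions and join the words in position order."""
    by_position = {}
    for word, positions in inverted_index.items():
        for pos in positions:
            by_position[pos] = word
    return " ".join(by_position[pos] for pos in sorted(by_position))


# A few rows from the table, trimmed to positions 0-6 for brevity.
rows = [
    "| abstract_inverted_index.is | 5 |",
    "| abstract_inverted_index.of | 2 |",
    "| abstract_inverted_index.The | 0 |",
    "| abstract_inverted_index.bitter | 3 |",
    "| abstract_inverted_index.crucial | 6 |",
    "| abstract_inverted_index.peptides | 4 |",
    "| abstract_inverted_index.identification | 1 |",
    "| cited_by_count | 0 |",
]

print(rebuild_abstract(parse_rows(rows)))
```

Rows whose key lacks the `abstract_inverted_index.` prefix, such as `cited_by_count`, are skipped, so the full payload table could be passed in unfiltered.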