Symbolic Regression with a Learned Concept Library
2024 · Open Access · DOI: https://doi.org/10.48550/arxiv.2409.09359
We present a novel method for symbolic regression (SR), the task of searching for compact programmatic hypotheses that best explain a dataset. The problem is commonly solved using genetic algorithms; we show that we can enhance such methods by inducing a library of abstract textual concepts. Our algorithm, called LaSR, uses zero-shot queries to a large language model (LLM) to discover and evolve concepts occurring in known high-performing hypotheses. We discover new hypotheses using a mix of standard evolutionary steps and LLM-guided steps (obtained through zero-shot LLM queries) conditioned on discovered concepts. Once discovered, hypotheses are used in a new round of concept abstraction and evolution. We validate LaSR on the Feynman equations, a popular SR benchmark, as well as a set of synthetic tasks. On these benchmarks, LaSR substantially outperforms a variety of state-of-the-art SR approaches based on deep learning and evolutionary algorithms. Moreover, we show that LaSR can be used to discover a novel and powerful scaling law for LLMs.
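To make the loop concrete, below is a minimal, hypothetical sketch in Python of the algorithm the abstract describes, run on a toy target. The LLM queries are stubbed out, and every name here (fitness, standard_mutation, llm_guided_mutation, llm_abstract_concepts, lasr) is an illustrative placeholder, not the paper's actual API.

```python
"""A toy sketch of the LaSR loop from the abstract: evolve symbolic
hypotheses, abstract textual concepts from the best ones via an LLM,
and condition later evolutionary steps on those concepts. All LLM
calls are stubbed; this is illustrative, not the paper's code."""

import random

# Toy dataset: y = x**2 + x, the kind of target a SR system searches for.
DATA = [(x, x**2 + x) for x in range(-5, 6)]

# A tiny fixed pool of candidate expressions standing in for a real
# expression-tree search space.
MUTATIONS = ["x", "x + 1", "x * x", "x * x + x", "x - 1", "2 * x"]

def fitness(expr: str) -> float:
    """Negative mean squared error of a candidate expression on DATA."""
    try:
        err = sum((eval(expr, {"x": x}) - y) ** 2 for x, y in DATA)
        return -err / len(DATA)
    except Exception:
        return float("-inf")

def standard_mutation(expr: str) -> str:
    """Plain evolutionary step: swap in a random candidate expression."""
    return random.choice(MUTATIONS)

def llm_guided_mutation(expr: str, concepts: list[str]) -> str:
    """Stub for a zero-shot LLM query conditioned on learned concepts.
    A real implementation would prompt an LLM with the concept library
    and ask for a refined hypothesis; here we just bias the search."""
    if "polynomial in x" in concepts:
        return random.choice(["x * x + x", "x * x", "x * x - x"])
    return standard_mutation(expr)

def llm_abstract_concepts(best: list[str]) -> list[str]:
    """Stub for the concept-abstraction query: summarize what the
    high-performing hypotheses share as short textual concepts."""
    return ["polynomial in x"] if any("x * x" in h for h in best) else []

def lasr(generations: int = 20, pop_size: int = 10) -> str:
    population = [random.choice(MUTATIONS) for _ in range(pop_size)]
    concepts: list[str] = []
    for _ in range(generations):
        # Mix standard and LLM-guided steps, as the abstract describes.
        population = [
            llm_guided_mutation(h, concepts) if random.random() < 0.5
            else standard_mutation(h)
            for h in population
        ]
        best = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        # New round of concept abstraction from the current best hypotheses.
        concepts = llm_abstract_concepts(best)
        population = best + best  # keep the fittest half, duplicated
    return max(population, key=fitness)

if __name__ == "__main__":
    print(lasr())  # typically recovers "x * x + x" on this toy target
```

In the actual system, per the abstract, the two stubbed functions would issue zero-shot prompts to an LLM, and the concept library would itself be evolved across rounds rather than recomputed from scratch each generation.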
Metadata
- Type: preprint
- Language: en
- Landing Page: http://arxiv.org/abs/2409.09359
- PDF: https://arxiv.org/pdf/2409.09359
- OA Status: green
- Related Works: 10
- OpenAlex ID: https://openalex.org/W4403667156