LS-LLM: Language Models Speed Up Local Search for Finding Programmatic Policies

https://research.tue.nl/en/publications/language-models-speed-up-local-search-for-finding-programmatic-po

TL;DR

We can find good policies (in code space) using Stochastic Hill-Climbing (SHC), which in turn uses LLMs to propose each search step.
LLMs benefit from feedback, i.e., the roll-out of the policy they generated in the previous iteration.
(Note that, unlike our method and like most of Levi’s work, they use a DSL and an AST to make edits to code.)
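
A minimal sketch of the loop described in the TL;DR, to make the idea concrete. Everything here is an assumption for illustration and not the paper's implementation: the "program" is a tuple of integers rather than a DSL AST, `toy_rollout` stands in for running the policy in an environment, and `toy_propose` is a random stub where a real system would prompt an LLM with the current program and its roll-out feedback. All names (`hill_climb`, `toy_rollout`, `toy_propose`, `TARGET`) are hypothetical.

```python
import random
from typing import Callable, Tuple

Program = Tuple[int, ...]   # hypothetical stand-in for a DSL program / AST
Feedback = int              # hypothetical stand-in for roll-out feedback

TARGET: Program = (3, -2, 5)  # toy "ideal policy", assumed for this example


def toy_rollout(program: Program) -> Tuple[float, Feedback]:
    """Score a program and return feedback (index of the worst component)."""
    errors = [t - p for p, t in zip(program, TARGET)]
    score = -float(sum(abs(e) for e in errors))
    worst = max(range(len(errors)), key=lambda i: abs(errors[i]))
    return score, worst


def toy_propose(program: Program, feedback: Feedback,
                rng: random.Random) -> Program:
    """Stub for the LLM call: randomly nudge the component named in feedback."""
    edited = list(program)
    edited[feedback] += rng.choice([-1, 1])
    return tuple(edited)


def hill_climb(
    initial: Program,
    propose: Callable[[Program, Feedback, random.Random], Program],
    rollout: Callable[[Program], Tuple[float, Feedback]],
    iterations: int,
    rng: random.Random,
) -> Tuple[Program, float]:
    """Keep a single incumbent; accept a proposed neighbor only if it scores higher."""
    best = initial
    best_score, feedback = rollout(best)
    for _ in range(iterations):
        # The proposer sees the incumbent and the feedback from its roll-out.
        candidate = propose(best, feedback, rng)
        score, candidate_feedback = rollout(candidate)
        if score > best_score:
            best, best_score, feedback = candidate, score, candidate_feedback
    return best, best_score


if __name__ == "__main__":
    best, score = hill_climb((0, 0, 0), toy_propose, toy_rollout,
                             iterations=200, rng=random.Random(0))
    print(best, score)
```

The point of the sketch is only the shape of the loop: the proposer receives roll-out feedback from the previous iteration, and the search keeps a candidate only when its roll-out score improves.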

Methodology