Historically, this kind of test optimization was done with static analysis to understand dependency graphs, with runtime data collected from executing the app, or both.
However, those methods are tightly bound to specific programming languages, frameworks, and interpreters, so they are difficult to support across technology stacks.
This approach substitutes the LLM's judgment for that analysis, making educated guesses about which tests to execute, with the same goal: run every test that could fail and none of the rest (balancing a precision/recall tradeoff). What’s especially interesting about this to me is that the same technique could be applied to any language or stack with minimal modification.
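To make that concrete, here's a rough sketch of what I imagine the mechanism looks like; the model name, prompt wording, SDK choice, and file layout are all my own assumptions, not details from the article. The idea is just: hand the model the diff plus the candidate test files, and have it return the subset worth running, with a conservative fallback to running everything.

```python
# Hypothetical sketch of LLM-based test selection (not the article's implementation).
# Assumes the OpenAI Python SDK; any capable model/provider would work the same way.
import json
import subprocess
from openai import OpenAI

client = OpenAI()

def select_tests(diff: str, test_files: list[str]) -> list[str]:
    """Return the subset of test_files the model guesses the diff could break."""
    prompt = (
        "Given this code change, list the test files that could plausibly fail.\n"
        "Respond with a JSON array of file paths chosen ONLY from the list below.\n\n"
        f"Diff:\n{diff}\n\nTest files:\n" + "\n".join(test_files)
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed model name, purely illustrative
        messages=[{"role": "user", "content": prompt}],
    )
    try:
        picked = json.loads(resp.choices[0].message.content)
    except (json.JSONDecodeError, TypeError):
        # If the output isn't parseable, fall back to running everything
        # rather than silently skipping tests (favor recall over precision).
        return test_files
    # Guard against hallucinated paths: keep only tests that actually exist.
    return [t for t in picked if t in test_files]

if __name__ == "__main__":
    diff = subprocess.run(
        ["git", "diff", "HEAD~1"], capture_output=True, text=True
    ).stdout
    tests = subprocess.run(
        ["git", "ls-files", "tests/"], capture_output=True, text=True
    ).stdout.splitlines()
    for t in select_tests(diff, tests):
        print(t)
```

Nothing in that sketch is language-specific beyond the path of the tests directory, which is what makes the approach portable across stacks.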
Has anyone seen LLMs being substituted for traditional analysis in other contexts to achieve language-agnostic results?