LLM-Powered Intelligent Test Suite Selector for CI Pipelines
CI pipelines run full test suites on every commit even when only a small fraction of tests are affected by the change. Developers wait 10-30 minutes for results when 90% of the tests are irrelevant. An HN user specifically requested an LLM that analyzes code changes and proposes relevant test suites with flakiness estimates. Datadog's Test Impact Analysis exists but is enterprise-priced and locked to their platform.
Coverage-based test selection is old tech. The LLM advantage is semantic understanding: it can read a diff, understand the behavioral change, and predict which tests exercise that behavior even without coverage data. Ship it as a GitHub Action that comments on PRs with a suggested test subset and per-test confidence scores. Start with a single language (Python or TypeScript) and prove the accuracy before going multi-language.
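The core loop is small: build a prompt from the diff plus the list of available tests, then parse the model's reply into a filtered subset. A minimal sketch below, assuming a hypothetical JSON reply shape of `[{"test": ..., "confidence": ...}]`; the prompt wording, reply format, and threshold are all illustrative assumptions, not a fixed design, and the actual LLM call is omitted.

```python
import json

def build_selection_prompt(diff: str, test_names: list[str]) -> str:
    # Hypothetical prompt format; the real wording would need tuning.
    return (
        "Given this code diff, pick the tests most likely to exercise "
        "the changed behavior.\n"
        f"Diff:\n{diff}\n"
        "Available tests:\n" + "\n".join(test_names) + "\n"
        'Reply as JSON: [{"test": "<name>", "confidence": <0-1>}]'
    )

def parse_selection(reply: str, threshold: float = 0.5) -> list[str]:
    # Keep only tests whose confidence clears the threshold.
    picks = json.loads(reply)
    return [p["test"] for p in picks if p["confidence"] >= threshold]

# Canned model reply standing in for a real LLM response:
reply = (
    '[{"test": "test_auth_login", "confidence": 0.9},'
    ' {"test": "test_billing", "confidence": 0.2}]'
)
print(parse_selection(reply))  # ['test_auth_login']
```

The confidence field is what the PR comment would surface, letting developers judge whether to trust the subset or fall back to the full suite.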
Landscape (3 existing solutions)
Test Impact Analysis is a known concept (coverage-based test selection), but existing implementations are either enterprise-locked (Datadog), ML-dependent and requiring months of training data (Launchable), or too simplistic (file-level Git change detection). Nobody has shipped an LLM-powered test selector that uses semantic code understanding rather than coverage maps. An LLM can read a diff and understand which behaviors changed, which is fundamentally different from tracking which lines executed.
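To see why file-level Git detection is the weak baseline, consider what it actually does: map changed source files to test files by name. A minimal sketch, with illustrative file names; the point is that a name-matching heuristic misses tests that exercise the changed behavior indirectly (here, a checkout flow test that depends on payments), which is exactly the gap semantic diff understanding is meant to close.

```python
from pathlib import Path

def file_level_select(changed_files: list[str], all_tests: list[str]) -> list[str]:
    # Naive heuristic: src/foo.py -> any test whose name contains "foo".
    stems = {Path(f).stem for f in changed_files if not f.startswith("tests/")}
    return [t for t in all_tests if any(s in t for s in stems)]

changed = ["src/payments.py"]
tests = ["tests/test_payments.py", "tests/test_checkout_flow.py"]
print(file_level_select(changed, tests))  # ['tests/test_payments.py']
# test_checkout_flow.py also exercises payments behavior, but the
# name-based heuristic cannot know that; an LLM reading the diff could.
```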