Flaky Test Auto-Detection and Quarantine for Small Engineering Teams

dev tool • real project • multiple requests

Flaky tests waste 6-8 hours of engineering time per week, and the problem is getting worse: the share of teams affected grew from 10% in 2022 to 26% in 2025. Enterprise tools like Trunk target large orgs with complex CI. Small teams of under 20 devs need affordable, drop-in flaky test detection that quarantines bad tests without requiring a platform engineering team.

builder note

Ship a GitHub Action that ingests JUnit XML reports, builds a flakiness score per test over time, and auto-adds a [quarantine] label. Free for public repos, $9/mo for private. The detection algorithm is straightforward. The moat is being the easiest thing to install.

landscape (3 existing solutions)

Enterprise teams build internal tools like Atlassian's Flakinator. Small teams either suffer or ignore the problem. BuildPulse is the closest small-team option, but the space still lacks a free-tier, open-source, GitHub Actions-native flaky test detector that auto-quarantines without configuration.

BuildPulse: Small-team friendly but focused narrowly on detection and reporting. No auto-fix suggestions. Pricing not transparent on site.
Trunk: Tailored for large-scale enterprises with complex CI/CD. Overkill and overpriced for a 5-15 person team.
TestDino: Newer entrant at $468-748/year for 10 users. AI failure classification is promising but adoption is limited. Playwright-native focus narrows the audience.

sources (3)

other https://testdino.com/blog/flaky-test-benchmark/ "proportion of teams experiencing test flakiness grew from 10% to 26%" 2026-03-01
other https://www.atlassian.com/blog/atlassian-engineering/taming-... "retry-based and Bayesian detection with automated quarantine" 2026-02-15
other https://buildpulse.io/ "find, quarantine, and fix flaky tests instantly" 2026-04-01
testing, CI/CD, flaky tests, developer productivity, GitHub Actions