Self-Hostable Bookmark-and-Full-Page-Archiver That Captures Reddit Threads Before They Vanish Behind the 2026 Paywall
Reddit has confirmed that paywalled subreddits are coming this year (CEO Steve Huffman, late 2025), and admins keep tightening API and search access. Self-hosters who use bookmark-everything tools (Karakeep, Linkwarden, Wallabag) are running into the same wall: snapshotting a Reddit thread today returns 'just a small blurb' or an empty shell, because Reddit's mobile-web layout hides the comment tree behind a 'see more' button. The demand is for a self-hosted archiver that drives a real browser engine (Playwright/Chromium), applies Reddit-specific tree expansion, captures the full comment tree to a single static HTML file, and can replay the archived thread once the original is paywalled or returns a 404.
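A minimal sketch of the expand-then-capture loop described above, assuming Playwright's Python sync API. The selector in MORE_SELECTOR and the function names are hypothetical placeholders for illustration, not Reddit's actual markup or any existing tool's code; a real adapter would need selectors checked against the live site.

```python
from typing import Any

# Assumption: a hypothetical selector for "load more comments" controls.
# Reddit's real markup changes often, so an adapter must maintain this value.
MORE_SELECTOR = 'button:has-text("more replies")'


def expand_comment_tree(page: Any, selector: str = MORE_SELECTOR,
                        max_clicks: int = 500) -> int:
    """Click expanders until none remain or the cap is hit; return the click count."""
    clicks = 0
    while clicks < max_clicks:
        expanders = page.locator(selector)
        if expanders.count() == 0:
            break  # nothing left to expand
        expanders.first.click()
        page.wait_for_timeout(250)  # give lazy-loaded replies time to render
        clicks += 1
    return clicks


def capture_thread(url: str) -> str:
    """Render a thread in headless Chromium and return the expanded HTML."""
    # Imported lazily because Playwright is a third-party dependency.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        expand_comment_tree(page)
        html = page.content()  # serialized DOM after expansion
        browser.close()
        return html
```

The click cap keeps a very large thread from looping indefinitely. Note that page.content() returns the DOM only; producing a truly single-file snapshot would also require inlining stylesheets and images, which this sketch leaves out.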
The unsexy play is to be a Karakeep plugin, not a competing app. Ship a 'site adapter pack' (Reddit, Twitter, Substack, Hacker News) that drops into Karakeep/Linkwarden via their plugin or sidecar API, and treat adapter packs as a recurring product: open-source the engine and charge for the maintained adapter set at $3/mo, a price signal that pays for the headless-Chromium upkeep.
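One way an adapter pack could be organized is as a small registry that maps a bookmark's hostname to site-specific settings. This is a sketch under stated assumptions: SiteAdapter, ADAPTER_PACK, and pick_adapter are hypothetical names, the selectors are made-up placeholders, and nothing here reflects the actual Karakeep or Linkwarden integration surface.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class SiteAdapter:
    name: str
    hosts: tuple[str, ...]   # domains this adapter claims
    expand_selector: str     # hypothetical placeholder, not real site markup


# Illustrative pack only; every selector below is a made-up placeholder.
ADAPTER_PACK = (
    SiteAdapter("reddit", ("reddit.com",), "[data-placeholder='more-comments']"),
    SiteAdapter("twitter", ("twitter.com", "x.com"), "[data-placeholder='show-replies']"),
    SiteAdapter("substack", ("substack.com",), "[data-placeholder='read-more']"),
    SiteAdapter("hackernews", ("news.ycombinator.com",), "[data-placeholder='more']"),
)


def pick_adapter(url: str) -> Optional[SiteAdapter]:
    """Return the adapter whose domain matches the URL's host, or None."""
    host = (urlparse(url).hostname or "").lower()
    for adapter in ADAPTER_PACK:
        # Match the bare domain or any subdomain (e.g. old.reddit.com).
        if any(host == h or host.endswith("." + h) for h in adapter.hosts):
            return adapter
    return None
```

Keeping the pack as plain data is what would make it shippable separately from the open-source engine: a subscriber could pull an updated pack without upgrading the engine itself. URLs with no matching adapter return None and can fall back to the host tool's generic snapshot.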
landscape (4 existing solutions)
Generic web-archiving tools are getting outflanked by site-specific anti-archiving techniques (Reddit's lazy-loaded comments, Twitter's auth-walling, Substack's truncation). A self-hostable archiver with site-specific extractors would fill a legitimate product gap.