I built a small scraper that does one thing well: you pass it URLs, it follows internal links, and it returns the emails it finds. The focus is speed and low noise.
Stack and guardrails: Crawlee + Cheerio, with a 15s timeout per page, 2 retries, and a cap of ~100 requests. Emails are pulled from mailto: links and visible text, then deduped. A typical site finishes in under 30s.
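To make "mailto and visible text, deduped" concrete, here is a rough standard-library Python sketch of that extraction step. It is not the scraper's actual code (that runs on Crawlee + Cheerio); the regex, the `EmailCollector` class, and `extract_emails` are illustrative assumptions, and real-world pages will need a more careful pattern.

```python
import re
from html.parser import HTMLParser
from urllib.parse import unquote

# Deliberately simple pattern; an assumption, not the tool's actual regex.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


class EmailCollector(HTMLParser):
    """Collects emails from mailto: hrefs and from visible text nodes."""

    def __init__(self):
        super().__init__()
        self.emails = set()   # lowercased, so duplicates collapse
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def _add(self, text):
        for match in EMAIL_RE.findall(text):
            self.emails.add(match.lower())

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
            return
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().startswith("mailto:"):
                # Drop the scheme and any ?subject=... query part.
                self._add(unquote(href[7:].split("?", 1)[0]))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Text inside <script>/<style> is not visible, so skip it.
        if not self._skip_depth:
            self._add(data)


def extract_emails(html):
    """Return the sorted, deduplicated emails found in one HTML page."""
    parser = EmailCollector()
    parser.feed(html)
    parser.close()
    return sorted(parser.emails)
```

Lowercasing before dedupe treats Sales@Example.com and sales@example.com as the same address, which is a design choice in this sketch rather than something the tool is confirmed to do.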
Output: JSON rows { url, email }. Export as CSV or pipe to your own thing.
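If you would rather convert the JSON yourself, a minimal Python sketch follows. It assumes the output is a JSON array of objects with exactly the `url` and `email` keys shown above; `rows_to_csv` is a hypothetical helper, not part of the tool.

```python
import csv
import io
import json


def rows_to_csv(json_text):
    """Convert a JSON array of {url, email} rows into CSV text."""
    rows = json.loads(json_text)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["url", "email"], lineterminator="\n")
    writer.writeheader()
    for row in rows:
        # Keep only the two documented fields, in a fixed column order.
        writer.writerow({"url": row["url"], "email": row["email"]})
    return buf.getvalue()
```

Calling `rows_to_csv(open("results.json").read())` would give you text you can write to a file or pipe onward; `results.json` is just a placeholder filename.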
Use it from code: API clients in JS and Python, OpenAPI, CLI, and an MCP endpoint. One token and a single call.
Pricing: pay per result, $5 per 1,000 emails. You can try it free first.
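For a quick budget estimate, the arithmetic works out as below. This sketch assumes billing is strictly proportional to the number of emails returned, with no minimum charge or rounding; `estimated_cost` is a hypothetical helper.

```python
from decimal import Decimal

PRICE_PER_1000 = Decimal("5")  # USD per 1,000 emails, as stated in the post


def estimated_cost(email_count):
    """Estimated USD cost for a run that returns email_count emails."""
    return PRICE_PER_1000 * email_count / 1000
```

So a run that returns 250 emails would come to about $1.25 under these assumptions.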
What I want from HN: edge cases where it breaks, false positives you notice, limits that feel off. Sample sites welcome.