Overview
JobHunter runs on a DigitalOcean VPS and checks 40+ target companies for new job postings every 30 minutes between 6am and 8pm PST. New listings are deduplicated against a local SQLite database and emailed immediately via the Resend API. A live dashboard updates with each run, showing per-company job counts and all tracked postings, with a built-in 30-minute delay so the email is always delivered first.
Key Features
- Multi-scraper architecture: Supports 8 scraper types (Greenhouse, Ashby, Workable, Eightfold, Phenom, Apple, Qualcomm, Uber) covering 40+ companies, including Stripe, LangChain, ElevenLabs, and Applied Intuition. The initial Workable rollout seeded over 2,000 jobs
- Early-exit pagination: Stops fetching mid-page as soon as a known job ID is encountered, cutting unnecessary network requests on repeat runs
- SQLite deduplication: A notified column tracks email state; jobs that fail to send are automatically retried on the next run
- Resend API email: HTML digest of new postings; avoids DigitalOcean’s SMTP block by using HTTPS
- Structured logging + per-run metrics: Logs duration, fetch count, new count, and errors per company to stdout and a persistent log file
- Live dashboard: Static HTML report regenerated each cron cycle, served via nginx — shows stat cards per company and a full job table
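The early-exit pagination feature above could look roughly like the sketch below. It is an illustration under stated assumptions, not the project's actual code: it assumes each job board returns postings newest-first, and the names fetch_new_jobs, fetch_page, known_ids, and max_pages are hypothetical.

```python
from typing import Callable


def fetch_new_jobs(
    fetch_page: Callable[[int], list[dict]],  # hypothetical: returns one page of postings
    known_ids: set[str],                      # job IDs already stored in the database
    max_pages: int = 50,                      # safety cap, an assumed default
) -> list[dict]:
    """Collect unseen postings, stopping at the first known job ID.

    Assumes postings arrive newest-first, so everything after a known
    ID has already been seen on an earlier run.
    """
    new_jobs: list[dict] = []
    for page in range(1, max_pages + 1):
        jobs = fetch_page(page)
        if not jobs:
            break  # ran out of pages
        for job in jobs:
            if job["id"] in known_ids:
                # Early exit mid-page: no further pages are requested.
                return new_jobs
            new_jobs.append(job)
    return new_jobs
```

On a repeat run where nothing has changed, this requests only the first page and returns an empty list. If a board does not sort newest-first, the early exit would skip postings, so that ordering would need to be checked per scraper.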
Architecture
cron (*/30 6-20 PST) → fetch_jobs() [early-exit pagination]
→ SQLite dedup → Resend email
→ generate_report() → nginx → live URL
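As a rough sketch of the dedup and email steps in this pipeline, the example below shows how a notified flag can drive retries. It uses the standard-library sqlite3 module instead of SQLAlchemy to stay self-contained. The table layout and the names init_db, record_jobs, notify_pending, and send_email are assumptions for illustration; send_email stands in for whatever wraps the Resend API call.

```python
import sqlite3
from typing import Callable


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: job_id is the dedup key, notified tracks email state.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               job_id   TEXT PRIMARY KEY,
               company  TEXT NOT NULL,
               title    TEXT NOT NULL,
               notified INTEGER NOT NULL DEFAULT 0
           )"""
    )
    conn.commit()


def record_jobs(conn: sqlite3.Connection, jobs: list[dict]) -> int:
    """Insert unseen jobs; rows whose job_id already exists are ignored."""
    inserted = 0
    for job in jobs:
        cur = conn.execute(
            "INSERT OR IGNORE INTO jobs (job_id, company, title) VALUES (?, ?, ?)",
            (job["id"], job["company"], job["title"]),
        )
        inserted += cur.rowcount  # 1 if inserted, 0 if ignored as a duplicate
    conn.commit()
    return inserted


def notify_pending(
    conn: sqlite3.Connection,
    send_email: Callable[[list[tuple]], None],  # hypothetical email sender
) -> int:
    """Email every job with notified = 0; mark them only if sending succeeds."""
    rows = conn.execute(
        "SELECT job_id, company, title FROM jobs WHERE notified = 0 ORDER BY job_id"
    ).fetchall()
    if not rows:
        return 0
    try:
        send_email(rows)
    except Exception:
        # Leave notified = 0 so the next cron run retries these jobs.
        return 0
    conn.executemany(
        "UPDATE jobs SET notified = 1 WHERE job_id = ?",
        [(row[0],) for row in rows],
    )
    conn.commit()
    return len(rows)
```

Because the flag is flipped only after send_email returns without raising, a failed send leaves the rows pending and they are picked up again on the following run, alongside any newly discovered jobs.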
Tech Stack
Python 3.13 · SQLAlchemy · SQLite · Resend API · nginx · DigitalOcean · pytest