JobyBots AI Job Hunter

Under the hood

The technology, in plain English.

Every library, every service, every reason it's here. No black boxes. No tracking pixels. No surprises.

~4,200

Lines of Python (the bot)

~18,000

Lines of TypeScript (the site)

0

Third-party servers your bot phones home to

The bot makes exactly one network call to jobybots.com per cycle (license check). It cannot — by design — upload your résumé or sent emails.

1. The bot on your machine

Everything that touches your résumé, your Gmail, and your data lives here. Single folder, no installer, no admin rights, no background services other than the ones you explicitly start.

  • Python 3.12 + virtualenv

    The whole bot is ~4,200 lines of Python. Open any file in Notepad / TextEdit and read it. No compiled binaries, no obfuscation.

  • SQLite (jobybot.db)

    Single file in data/jobybot.db holds every job found, every email sent, every bounce, every license check. Open it in DB Browser for SQLite to inspect. Never replicated to any server.

  • Pydantic Settings

    Reads .env at startup; validates types; refuses to run if a required field is missing. Catches typos in your config before they cost you a cycle.

  • Loguru

    Structured logging to data/jobybot.log. The last-run log shows you exactly what happened. Capped at 10 MB so it can never fill your disk.

  • APScheduler

    Pure-Python cron-style scheduler. Runs the cycle every N minutes inside the bot process — no Windows Task Scheduler dependency for the main loop (we also register a daily 9am task as a belt-and-suspenders fallback).

  • Playwright (Easy Apply, optional)

    When ENABLE_EASY_APPLY=true, the bot drives a real Chromium window via Playwright (headed by default so you can watch it). Only used for LinkedIn Easy Apply automation. Chromium (~150 MB) is downloaded once. Off by default. See /easy-apply for the full algorithm.

  • stdlib http.server

    Review Queue web UI uses only Python's built-in HTTP server. No Flask, no FastAPI, no extra dependency surface. ~250 lines including the HTML.

  • smtplib (stdlib)

    Email sending uses Python's built-in SMTP client. Hardcoded to Gmail's STARTTLS endpoint. Cannot be reconfigured for a different provider.
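
To make the SQLite bullet above concrete, here is a minimal sketch of inspecting the database from Python instead of DB Browser. It only assumes the data/jobybot.db path mentioned above; the helper names are hypothetical, and no table names are assumed because the schema is read from the file itself.

```python
import sqlite3


def open_read_only(path: str = "data/jobybot.db") -> sqlite3.Connection:
    """Open the bot's database read-only, so inspection cannot modify it."""
    # mode=ro is standard SQLite URI syntax; the path is the one named above.
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def summarize(conn: sqlite3.Connection) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in the database."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite%' "
            "ORDER BY name"
        )
    ]
    # Table names come from sqlite_master, so quoting them is enough here.
    return {
        table: conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
        for table in tables
    }
```

Calling summarize(open_read_only()) from the bot's folder would list each table with its row count, which is a quick way to see what the bot has stored.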

2. Job discovery

How the bot finds the 250+ jobs/day you see in your dashboard. All sources are public; all of them respect rate-limits and robots.txt.

  • Playwright (LinkedIn)

    Logged-in session using YOUR li_at cookie. Discovery reads public job search pages only; it never auto-applies, never sends messages, never opens connection requests. (Easy Apply is a separate, opt-in feature.)

  • requests + BeautifulSoup

    Plain HTTPS GET requests with a real browser User-Agent for Indeed, Bayt, NaukriGulf, GulfTalent, and ~40 company career pages. Throttled to ~1 page / 2 seconds per source.

  • RemoteOK JSON feed

    Uses the official public feed at remoteok.com/api.json. No HTML scraping, no auth.

  • Per-country market plans

    config/markets/*.json holds curated career pages + recruiter email patterns for UAE, Saudi, Qatar, Oman, Bahrain, UK, India. Easy to add a country by dropping a JSON file.
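
The "~1 page / 2 seconds per source" throttle above can be sketched as a small per-source rate limiter. This is an illustration under stated assumptions, not the bot's actual code: the class name, the source labels, and the injectable clock and sleep functions are all hypothetical.

```python
import time
from typing import Callable


class SourceThrottle:
    """Enforce a minimum gap between requests to the same source."""

    def __init__(
        self,
        min_interval: float = 2.0,  # seconds between pages, per source
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last: dict[str, float] = {}  # source -> time of last request

    def wait(self, source: str) -> float:
        """Block until `source` may be fetched again; return seconds slept."""
        now = self._clock()
        last = self._last.get(source)
        delay = 0.0
        if last is not None:
            delay = max(0.0, self.min_interval - (now - last))
        if delay > 0:
            self._sleep(delay)
        # Record when this request is actually allowed to go out.
        self._last[source] = now + delay
        return delay
```

A scraper would call throttle.wait("indeed") before each GET. Because the timestamps are kept per source, a slow source never delays the others.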

3. AI scoring + cover letters

Optional. The bot works without AI; enabling it raises reply quality. We never call an AI without an explicit key in your .env.

  • Google Gemini Flash

    Primary model for AI match scoring (0–100) and tailored cover-letter drafting. ~$0/month on Google AI Studio's free tier for most usage. Falls back gracefully if the quota is exhausted.

  • Groq (optional)

    Optional Llama-3.3-70B fallback for when Gemini is down, or if you simply prefer it. Toggle via GROQ_API_KEY in .env.

  • Hand-written templates

    If no AI key is set, the bot uses curated Jinja2 templates per category (PM, BA, etc.) with your résumé data merged in. Still personalised, just rule-based.
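
The fallback order described above (Gemini first, Groq second, hand-written templates last) might look roughly like the sketch below. GROQ_API_KEY is the variable named above; GEMINI_API_KEY, the function names, and the drafter callables are assumptions made for illustration, and no real AI client is called here.

```python
from typing import Callable, Mapping, Sequence

Drafter = Callable[[str], str]  # takes a job title, returns a cover letter


def configured_providers(env: Mapping[str, str]) -> list[str]:
    """List AI providers that have a non-empty key, in priority order."""
    # GEMINI_API_KEY is a hypothetical name; GROQ_API_KEY comes from the text.
    order = [("gemini", "GEMINI_API_KEY"), ("groq", "GROQ_API_KEY")]
    return [name for name, key in order if env.get(key, "").strip()]


def draft_with_fallback(
    job_title: str,
    drafters: Sequence[tuple[str, Drafter]],
    template_drafter: Drafter,
) -> tuple[str, str]:
    """Try each AI drafter in order; fall back to the rule-based template."""
    for name, drafter in drafters:
        try:
            return name, drafter(job_title)
        except Exception:
            # Quota exhausted, provider down, etc.: move on to the next one.
            continue
    return "template", template_drafter(job_title)
```

With no keys configured the drafters list is empty, so the template path is the only one that runs, which matches the rule that no AI is called without an explicit key.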

4. Email finding (the secret sauce)

Recruiters rarely list their email on a job post. The bot has a 5-tier waterfall to find a verified address — and falls silent if no tier returns one with high confidence.

  • Tier 0 — cache

    Already-verified emails from previous cycles are reused without re-checking.

  • Tier 1 — careers page scrape

    Visits company.com/careers and parses for mailto: links. Highest precision; fully ToS-safe.

  • Tier 2 — LinkedIn HR lookup

    Uses your li_at cookie to find recruiter profiles on the job post and resolve their email via Hunter-style domain pattern matching. Quota-capped at 30 lookups/day.

  • Tier 3 — country-aware patterns

    Lowest precision. Tries common GCC patterns (careers@, hr@, jobs@) but only forwards an address if Tier 4 verifies it.

  • Tier 4 — SMTP RCPT probe

    Speaks SMTP to the candidate domain's MX server, asks 'do you accept mail for <address>?', and drops anything that fails. Results cached so we don't re-probe.
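
The waterfall above can be sketched in a few lines. This is a simplified illustration, not the bot's implementation: the function names, the cache keyed by domain, and the probe sender address are hypothetical, the MX host is passed in because MX lookup needs a DNS library, and Tier 1 and Tier 2 results are assumed to be trusted without a probe.

```python
import smtplib
from typing import Callable, Optional, Sequence

Finder = Callable[[str], Optional[str]]  # domain -> address, or None


def rcpt_probe(
    mx_host: str,
    address: str,
    probe_from: str = "probe@example.com",  # placeholder sender
    timeout: float = 10.0,
) -> bool:
    """Tier 4: ask the MX server whether it accepts mail for `address`."""
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as smtp:
            smtp.ehlo_or_helo_if_needed()
            smtp.mail(probe_from)
            code, _ = smtp.rcpt(address)  # no DATA is sent, so no email goes out
            return 200 <= code < 300
    except (OSError, smtplib.SMTPException):
        return False


def find_email(
    domain: str,
    cache: dict[str, str],
    finders: Sequence[Finder],
    local_parts: Sequence[str],
    verify: Callable[[str], bool],
) -> Optional[str]:
    """Run the tiers in order; return None if none yields an address."""
    if domain in cache:  # Tier 0: reuse an earlier result
        return cache[domain]
    for finder in finders:  # Tiers 1 and 2: careers page, LinkedIn lookup
        address = finder(domain)
        if address:
            cache[domain] = address
            return address
    for local in local_parts:  # Tier 3: guessed patterns such as hr@
        candidate = f"{local}@{domain}"
        if verify(candidate):  # forwarded only if Tier 4 accepts it
            cache[domain] = candidate
            return candidate
    return None  # fall silent
```

In practice verify would wrap rcpt_probe with the domain's MX host. Keeping it as a parameter lets the waterfall be tested without touching the network.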

5. Marketing site (jobybots.com)

Everything you see when you visit the website. It is completely separate from the bot: the website never sees your data, and the bot never connects to the website except for the once-per-cycle license check.

  • Next.js 15 (App Router)

    Server components for SEO, client components for interactivity. React 19. TypeScript strict mode.

  • Tailwind CSS

    Utility-first styling. Zero runtime CSS-in-JS overhead. ~30KB final stylesheet.

  • Framer Motion

    Used sparingly for the founder story and product tour. Respects prefers-reduced-motion.

  • Vercel KV (Redis)

    Stores customer accounts, payment orders, and machine-license bindings. Backed up nightly.

  • scrypt (node:crypto)

    Customer passwords are stored with scrypt N=16384, r=8, p=1. Even with a full dump of the KV store, brute-force would cost ~$10k per password.

  • Nodemailer + Gmail SMTP

    Transactional emails (activation, rejection) use the same Gmail account the support team monitors. No SendGrid / Postmark / Mailgun — fewer vendors, smaller blast radius.
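
For readers who want to see what the scrypt parameters above mean in practice, here is a sketch using the same N, r, and p. It is written in Python, the bot's language, rather than the site's TypeScript, so it is an equivalent illustration and not the site's code. The 16-byte salt, 64-byte key length, and function names are assumptions.

```python
import hashlib
import hmac
import os
from typing import Optional

# Cost parameters quoted above; salt and key sizes below are assumptions.
SCRYPT_PARAMS = {"n": 16384, "r": 8, "p": 1}


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a password, generating a salt if needed."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.scrypt(
        password.encode("utf-8"),
        salt=salt,
        dklen=64,
        maxmem=64 * 1024 * 1024,  # headroom above the ~16 MiB these params need
        **SCRYPT_PARAMS,
    )
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)
```

Each guess forces an attacker to spend roughly 16 MiB of memory and the full computation, which is what makes offline brute-force expensive.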

6. What we deliberately don't use

Every dependency is a future security headache. Here's what we cut.

  • No analytics SDK

    No Google Analytics, no Mixpanel, no PostHog, no Plausible. We use Vercel's built-in deploy logs and that's it. You're not being tracked.

  • No third-party CDNs

    All fonts, icons, and scripts are self-hosted. No Google Fonts, no jsDelivr, no FontAwesome. Page works offline once loaded.

  • No payment processor on file

    UPI payments are settled out-of-band with a screenshot upload. No Stripe, no Razorpay, no card numbers ever touch our infrastructure.

  • No background services on your laptop

    Only what you explicitly start. No system tray icon. No 'JobyBots Helper' lurking in Activity Monitor. Close the terminal window and the bot is gone.

Read it yourself

The entire bot is on GitHub. Clone it, audit it, fork it, run your own.