Bug Bounty Automation with Python: The Secrets of Bug Hunting

Master bug bounty automation with Python: build recon pipelines, automate scanning, and discover the secrets pro hunters use to find bugs faster.

Chris · May 25, 2026 · Updated Jul 1, 2026

Bug bounty hunting rewards two things: coverage and speed. The hunters at the top of leaderboards aren't necessarily better at finding vulnerabilities than everyone else; they just look at more targets, faster, with less manual effort. That's where Python comes in. It's the duct tape of the bug bounty world: it glues together reconnaissance tools, parses their output, filters the noise, and surfaces the signals that actually lead to payouts.

This guide walks through how to build a real automation pipeline in Python, the patterns experienced hunters use, and the mistakes that keep beginners stuck at low-severity reports.

Why Python Dominates Bug Bounty Automation

Most public bug bounty tools (subfinder, httpx, nuclei, ffuf, gau) are written in Go for speed. So why Python on top of them?

Because the work that wins bounties isn't running tools. It's chaining them, deduplicating their output, enriching results with context, and noticing the one weird response in 50,000 that nobody else looked at. Python's requests, asyncio, httpx (the Python HTTP client, not the Go prober of the same name), and BeautifulSoup libraries make that orchestration straightforward, and its ecosystem (pandas, sqlite3, re) is well suited to slicing through messy data.

The mental model: Go tools do the heavy lifting, Python does the thinking.

The Core Bug Bounty Automation Pipeline

Every serious automation setup follows roughly the same five stages:

1. Asset discovery: find subdomains, IPs, and endpoints owned by the target
2. Liveness probing: figure out what's actually reachable
3. Fingerprinting: identify tech stacks, frameworks, and versions
4. Vulnerability scanning: run templated checks against the live attack surface
5. Notification and triage: get alerted only when something interesting appears

Let's build a minimal version of this in Python.

Stage 1: Subdomain Enumeration

Python

import subprocess
import json

def enumerate_subdomains(domain: str) -> set[str]:
    sources = [
        ["subfinder", "-d", domain, "-silent"],
        ["assetfinder", "--subs-only", domain],
    ]
    subdomains = set()
    for cmd in sources:
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=300)
            subdomains.update(line.strip() for line in result.stdout.splitlines() if line.strip())
        except (subprocess.TimeoutExpired, FileNotFoundError) as e:
            print(f"[!] {cmd[0]} failed: {e}")
    return subdomains

The secret here isn't the code; it's running this on a schedule. New subdomains appear constantly. A cron job that diffs today's results against yesterday's surfaces fresh attack surface, often before other hunters notice it.
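One way to sketch that diff, assuming you keep a small JSON state file between runs. The function name `diff_against_previous` and the state-file layout are illustrative, not something the tools above provide. This variant remembers every name ever seen rather than only yesterday's snapshot, so a flaky source does not keep re-reporting old hosts.

```python
import json
from pathlib import Path


def diff_against_previous(current: set[str], state_file: Path) -> set[str]:
    """Return subdomains not seen in any earlier run, then update the state file."""
    seen: set[str] = set()
    if state_file.exists():
        seen = set(json.loads(state_file.read_text()))

    fresh = current - seen

    # Persist the union so the next run only reports names that are new again.
    state_file.write_text(json.dumps(sorted(seen | current)))
    return fresh
```

Called from a scheduled job, the return value is the list worth passing to the next stage; everything else has already been looked at.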

Stage 2: Async Liveness Probing

This is where Python's asyncio saves hours. Instead of probing 10,000 hosts one at a time, you fire them off in parallel:

Python

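import asyncio
import re

import httpx

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def extract_title(html: str) -> str:
    """Return the page <title> text, or an empty string if there is none."""
    match = TITLE_RE.search(html)
    return match.group(1).strip() if match else ""

async def probe(client: httpx.AsyncClient, url: str) -> dict | None:
    try:
        r = await client.get(url, follow_redirects=True)
        return {
            "url": str(r.url),
            "status": r.status_code,
            "title": extract_title(r.text),
            "server": r.headers.get("server", ""),
            "length": len(r.content),
        }
    except Exception:
        # DNS failures, timeouts, TLS errors, bad hostnames: treat the host as not live
        return None

async def probe_all(hosts: list[str]) -> list[dict]:
    # max_connections caps how many requests are in flight at once
    limits = httpx.Limits(max_connections=50)
    # pool=None: requests queued behind the connection cap wait instead of timing out
    timeout = httpx.Timeout(10.0, pool=None)
    # verify=False skips certificate validation so hosts with broken TLS still get probed
    async with httpx.AsyncClient(limits=limits, timeout=timeout, verify=False) as client:
        tasks = [probe(client, f"https://{h}") for h in hosts]
        results = await asyncio.gather(*tasks)
    return [r for r in results if r]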

With fifty concurrent connections, 10,000 hosts typically finish in minutes rather than hours, depending on how many of them time out. Stay polite: don't hammer a single target with hundreds of simultaneous requests.
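One way to stay polite is to cap in-flight requests per hostname on top of the global connection limit. The sketch below assumes a hypothetical `PerHostLimiter` helper and a `fetch` coroutine that you supply; neither is part of httpx.

```python
import asyncio
from collections import defaultdict
from typing import Awaitable, Callable
from urllib.parse import urlsplit


class PerHostLimiter:
    """Hypothetical helper: caps the number of in-flight requests per hostname."""

    def __init__(self, per_host: int = 2) -> None:
        # One semaphore per hostname, created lazily on first use.
        self._slots: dict[str, asyncio.Semaphore] = defaultdict(
            lambda: asyncio.Semaphore(per_host)
        )

    def for_url(self, url: str) -> asyncio.Semaphore:
        host = urlsplit(url).hostname or url
        return self._slots[host]


async def polite_call(
    limiter: PerHostLimiter,
    url: str,
    fetch: Callable[[str], Awaitable[object]],
) -> object:
    """Run fetch(url) only once a slot for that URL's host is free."""
    async with limiter.for_url(url):
        return await fetch(url)
```

Requests to different hosts still run in parallel; only requests that share a hostname queue behind each other.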

Stage 3: Smart Diffing, the Real Secret

Most beginners run their pipeline, look at the results, and feel overwhelmed. Pros never look at full output. They look at diffs.

Python

import sqlite3
from datetime import datetime, timezone

def store_and_diff(results: list[dict], db_path: str = "recon.db") -> list[dict]:
    conn = sqlite3.connect(db_path)
    conn.execute("""CREATE TABLE IF NOT EXISTS hosts (
        url TEXT PRIMARY KEY, status INT, title TEXT,
        server TEXT, length INT, first_seen TEXT, last_seen TEXT
    )""")
    new_findings = []
    # datetime.utcnow() is deprecated since Python 3.12; use an aware UTC timestamp
    now = datetime.now(timezone.utc).isoformat()
    for r in results:
        existing = conn.execute(
            "SELECT status, title, length FROM hosts WHERE url=?", (r["url"],)
        ).fetchone()
        if existing is None:
            new_findings.append(r)
            conn.execute("""INSERT INTO hosts VALUES (?,?,?,?,?,?,?)""",
                (r["url"], r["status"], r["title"], r["server"], r["length"], now, now))
            continue
        old_status, old_title, old_length = existing
        # Meaningful changes: status code, title, or a response size delta over 500 bytes
        if (old_status != r["status"] or old_title != r["title"]
                or abs(old_length - r["length"]) > 500):
            new_findings.append({**r, "_changed": True})
        # Store the latest values so the same change is not reported on every run
        conn.execute(
            "UPDATE hosts SET status=?, title=?, server=?, length=?, last_seen=? WHERE url=?",
            (r["status"], r["title"], r["server"], r["length"], now, r["url"]))
    conn.commit()
    conn.close()
    return new_findings

This is the single biggest force multiplier in bug bounty automation. You're not hunting for vulnerabilities; you're hunting for changes that signal new vulnerabilities. A new subdomain, a status code flip from 403 to 200, a response length jump of 50KB: these are the signals worth investigating.
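To make those signals readable in a notification, you might label each change with a short reason. This is a minimal sketch; `describe_change` and its 500-byte default threshold are assumptions for illustration, and the dictionaries follow the shape returned by the probing stage.

```python
def describe_change(old: dict, new: dict, length_threshold: int = 500) -> list[str]:
    """Return human-readable reasons why a host looks different from last time."""
    reasons: list[str] = []

    if old["status"] != new["status"]:
        reasons.append(f"status {old['status']} -> {new['status']}")

    delta = new["length"] - old["length"]
    if abs(delta) > length_threshold:
        # Signed so you can tell growth from shrinkage at a glance.
        reasons.append(f"length {delta:+d} bytes")

    if old.get("title") != new.get("title"):
        reasons.append("title changed")

    return reasons


before = {"status": 403, "length": 1000, "title": "Forbidden"}
after = {"status": 200, "length": 52200, "title": "Admin"}
print(describe_change(before, after))
# ['status 403 -> 200', 'length +51200 bytes', 'title changed']
```

An empty list means nothing crossed a threshold, which is the common case and the one you never need to see.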

Stage 4: Targeted Vulnerability Scanning

Don't blast every host with every check. Use the fingerprinting data to scan smartly:

Python

def select_nuclei_templates(host: dict) -> list[str]:
    # Paths are relative to your nuclei-templates checkout; adjust them if your
    # version lays out its directories differently.
    templates = ["cves/", "exposures/"]
    server = host.get("server", "").lower()
    if "nginx" in server:
        templates.append("technologies/nginx/")
    if "apache" in server:
        templates.append("technologies/apache/")
    if host.get("status") == 401:
        templates.append("default-logins/")
    return templates

Running 5,000 targeted templates beats running 50,000 generic ones, both in signal quality and in how quickly you get results.
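Turning a template selection into a scan could look roughly like this. The helper names are hypothetical, and the sketch assumes nuclei is on your PATH and accepts `-u` for the target, a repeated `-t` for each template path, and `-silent`; check `nuclei -h` for your installed version before relying on it.

```python
import subprocess


def build_nuclei_command(url: str, templates: list[str]) -> list[str]:
    """Build the argument list for one targeted nuclei run (flags are assumptions)."""
    cmd = ["nuclei", "-u", url, "-silent"]
    for template in templates:
        cmd += ["-t", template]
    return cmd


def run_nuclei(url: str, templates: list[str], timeout: int = 600) -> str:
    """Run nuclei against a single URL and return its raw stdout."""
    result = subprocess.run(
        build_nuclei_command(url, templates),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Keeping command construction separate from execution makes the selection logic easy to test without touching a real target.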

Stage 5: Push Notifications

Automation is worthless if you have to check it manually. Wire it to Discord, Telegram, or Slack:

Python

import httpx

def notify(webhook: str, findings: list[dict]) -> None:
    if not findings:
        return
    msg = f"🎯 {len(findings)} new findings:\n" + "\n".join(
        f"• [{f['status']}] {f['url']}" for f in findings[:10]
    )
    # Discord webhooks read the "content" field; Slack expects "text" instead
    httpx.post(webhook, json={"content": msg}, timeout=10)

The dream setup: you wake up, check your phone, see three new endpoints flagged overnight, and go investigate them with coffee in hand.

Secrets That Separate Top Hunters from the Rest

Focus on scope freshness, not scope size. Hunters chasing the same 500 huge targets get scraps. Set up monitoring for newly added programs on HackerOne, Bugcrowd, and Intigriti: the first 48 hours after a scope expansion are gold.

Build a personal wordlist. Generic wordlists like SecLists are everyone's wordlist. Mine your own findings: every interesting endpoint, parameter name, and JS variable you discover goes into a custom list. Over time it becomes your unfair advantage.
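A personal wordlist can start as nothing more than a frequency count over URLs you have already seen. The sketch below is one possible approach; `harvest_words` is a hypothetical helper, and it only covers path segments and query parameter names, not JS variables.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit


def harvest_words(urls: list[str]) -> Counter:
    """Count path segments and query parameter names across discovered URLs."""
    words: Counter = Counter()
    for url in urls:
        parts = urlsplit(url)
        # Path segments: /api/v2/users -> api, v2, users
        words.update(seg for seg in parts.path.split("/") if seg)
        # Parameter names only; the values are usually target-specific noise.
        words.update(
            name for name, _ in parse_qsl(parts.query, keep_blank_values=True)
        )
    return words
```

Sorting the result with `most_common()` puts the terms you see most often at the top of the list you feed to a fuzzer.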

Watch JavaScript files, not HTML. Modern apps leak endpoints, API keys, and internal hostnames in their bundled JS. Automate fetching .js files, regex out URLs and secrets, and diff them weekly.

Python

import re

JS_URL_REGEX = re.compile(r'["\'](/[a-zA-Z0-9_\-/]+(?:\?[a-zA-Z0-9_=&\-]*)?)["\']')
SECRET_PATTERNS = {
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "google_api": re.compile(r"AIza[0-9A-Za-z\-_]{35}"),
    "jwt": re.compile(r"eyJ[A-Za-z0-9_\-]+\.[A-Za-z0-9_\-]+\.[A-Za-z0-9_\-]+"),
}
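A minimal sketch of how those patterns might be applied to the text of a downloaded bundle. `scan_js` is a hypothetical helper, and the patterns are repeated here only so the block runs on its own. The key in the sample is the placeholder AWS uses in its own documentation, not a real credential.

```python
import re

JS_URL_REGEX = re.compile(r'["\'](/[a-zA-Z0-9_\-/]+(?:\?[a-zA-Z0-9_=&\-]*)?)["\']')
SECRET_PATTERNS = {
    "aws_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "google_api": re.compile(r"AIza[0-9A-Za-z\-_]{35}"),
    "jwt": re.compile(r"eyJ[A-Za-z0-9_\-]+\.[A-Za-z0-9_\-]+\.[A-Za-z0-9_\-]+"),
}


def scan_js(source: str) -> dict[str, set[str]]:
    """Extract quoted relative endpoints and candidate secrets from JS source text."""
    findings = {"endpoints": set(JS_URL_REGEX.findall(source))}
    for name, pattern in SECRET_PATTERNS.items():
        findings[name] = set(pattern.findall(source))
    return findings


sample = 'fetch("/api/v1/users?id=1"); const k = "AKIAIOSFODNN7EXAMPLE";'
result = scan_js(sample)
print(result["endpoints"])  # {'/api/v1/users?id=1'}
print(result["aws_key"])    # {'AKIAIOSFODNN7EXAMPLE'}
```

Returning sets makes the weekly diff a plain set subtraction between this run's findings and the last.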

Respect the rules. Aggressive automation gets you banned. Rate-limit yourself, honor robots.txt boundaries when the program asks, and never automate against out-of-scope assets. One mistake and you lose access to programs where you spent months building a reputation.

Architecture Tips for Long-Term Hunting

Run your pipeline on a cheap VPS, not your laptop. Continuous recon needs to run 24/7 across thousands of targets without you babysitting it. A $5/month VPS with a cron job that runs every 6 hours pays for itself the first month you find a medium-severity bug.

Store everything in SQLite or PostgreSQL. Files get unmanageable past a few thousand hosts. A database lets you query things like "show me every host where the title changed in the last 7 days and the server is nginx", the kinds of questions that lead to real findings.
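As a rough illustration of that kind of question, the query below runs against the hosts table from Stage 3. That schema does not record when a title changed, so this sketch filters on first_seen instead; tracking title history would need an extra column or table. `recent_nginx_hosts` is a hypothetical helper.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def recent_nginx_hosts(conn: sqlite3.Connection, days: int = 7) -> list[tuple]:
    """Return (url, status, title) for nginx hosts first seen in the last `days` days."""
    cutoff = (datetime.now(timezone.utc) - timedelta(days=days)).isoformat()
    # ISO 8601 timestamps sort chronologically as plain strings.
    rows = conn.execute(
        "SELECT url, status, title FROM hosts "
        "WHERE first_seen >= ? AND lower(server) LIKE '%nginx%' "
        "ORDER BY first_seen DESC",
        (cutoff,),
    )
    return rows.fetchall()
```

The same pattern extends to status codes or response sizes by swapping the WHERE clause.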

Modularize ruthlessly. One script per stage, communicating through the database or JSON files. When subfinder releases a new version or you want to swap in amass, you change one module, not your entire pipeline.

Where to Go From Here

Start small. Pick one target, build the five-stage pipeline above, and run it for two weeks. You'll quickly see what your bottleneck actually is. Usually it's noise filtering, not tool capability. Iterate from there.

The hunters making six figures aren't using secret tools. They're using the same subfinder and nuclei you have access to right now, but wrapped in Python automation that runs while they sleep, surfaces only changes, and lets them spend their actual hunting time on the 1% of results worth investigating.

That's the real secret: bug bounty automation isn't about finding vulnerabilities automatically. It's about automating everything except the part where you find vulnerabilities.

Written by

Chris

Tech builder · Agentic AI & offensive security

A tech-obsessed builder, I'm building Sentinelle — an autonomous offensive-security AI agent. I write here about agentic AI, AI-assisted pentesting, and what I learn shipping offensive tooling.

@T_temery