Detecting BGP Route Leaks with Python and RPKI

Why Route Leaks Still Happen

Despite years of best-practice documentation, BGP route leaks remain one of the most common causes of large-scale internet outages. A single misconfigured peer can advertise routes it should not, redirecting traffic through unintended paths or causing reachability failures across entire regions.

The problem is that BGP was designed around trust. When your peer announces a prefix, your router accepts it if it matches your inbound policy. If your policy is too permissive — or if you forgot to apply one — leaked routes slip through.

The Detection Approach

Rather than waiting for NOC tickets, we can build proactive monitoring. The approach combines two data sources:

BGP stream data — real-time route announcements from public route collectors
RPKI validation — checking whether announcements match the signed ROA (Route Origin Authorization) records

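The script below sketches the pipeline. It assumes rpki-client runs periodically with -j and writes its validated ROA payloads to a JSON file; the path shown is an assumption and depends on your installation.

import ipaddress
import json

from pybgpstream import BGPStream

# Assumption: location of rpki-client's JSON export. Adjust for your system.
VRP_FILE = "/var/db/rpki-client/json"

def load_vrps(path=VRP_FILE):
    """Load (network, max_length, asn) tuples from the JSON export."""
    with open(path) as fh:
        data = json.load(fh)
    vrps = []
    for roa in data.get("roas", []):
        # Some exports write the ASN as "AS64496", others as an integer.
        asn = int(str(roa["asn"]).upper().replace("AS", ""))
        vrps.append((ipaddress.ip_network(roa["prefix"]), int(roa["maxLength"]), asn))
    return vrps

def validate_rpki(prefix, origin_asn, vrps):
    """Classify an announcement as "valid", "invalid", or "not-found"."""
    net = ipaddress.ip_network(prefix)
    covered = False
    for roa_net, max_length, asn in vrps:
        if roa_net.version != net.version or not net.subnet_of(roa_net):
            continue
        covered = True  # at least one ROA covers this prefix
        if asn == origin_asn and net.prefixlen <= max_length:
            return "valid"
    # Covered but no matching ROA is invalid; no covering ROA is not-found.
    return "invalid" if covered else "not-found"

vrps = load_vrps()

stream = BGPStream(
    project="ris-live",
    record_type="updates",
    # Placeholder: 10.0.0.0/8 is private address space and should not appear
    # on public collectors. Substitute a prefix you originate.
    filter="prefix more 10.0.0.0/8",
)

for rec in stream.records():
    for elem in rec:
        if elem.type != "A":
            continue
        prefix = elem.fields["prefix"]
        as_path = elem.fields["as-path"].split()
        # Skip empty paths and AS_SET origins such as "{64496,64497}".
        if not as_path or not as_path[-1].isdigit():
            continue
        origin = int(as_path[-1])
        if validate_rpki(prefix, origin, vrps) == "invalid":
            print(f"RPKI INVALID: {prefix} from AS{origin}")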
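
At collector volume, the per-announcement lookup has to be cheap, and it helps to see the three validation outcomes side by side. The following is a minimal sketch, not the pipeline's actual implementation: `VrpIndex` is a hypothetical helper that keys ROA payloads by prefix and walks up from the announced prefix through its covering supernets. The prefixes and ASNs are documentation values.

```python
import ipaddress
from collections import defaultdict


class VrpIndex:
    """Hypothetical in-memory index of ROA payloads keyed by ROA prefix."""

    def __init__(self):
        self._by_prefix = defaultdict(list)  # network -> [(max_length, asn)]

    def add(self, prefix, max_length, asn):
        self._by_prefix[ipaddress.ip_network(prefix)].append((max_length, asn))

    def validate(self, prefix, origin_asn):
        """Return "valid", "invalid", or "not-found" for one announcement."""
        net = ipaddress.ip_network(prefix)
        covered = False
        # Check the announced prefix itself, then every shorter covering prefix.
        for length in range(net.prefixlen, -1, -1):
            candidate = net.supernet(new_prefix=length)
            for max_length, asn in self._by_prefix.get(candidate, ()):
                covered = True
                if asn == origin_asn and net.prefixlen <= max_length:
                    return "valid"
        return "invalid" if covered else "not-found"


index = VrpIndex()
index.add("192.0.2.0/24", 24, 64496)
index.add("203.0.113.0/24", 25, 64497)

for prefix, asn in [
    ("192.0.2.0/24", 64496),     # matching origin and length
    ("192.0.2.0/24", 64511),     # covered, wrong origin
    ("192.0.2.0/25", 64496),     # covered, longer than maxLength
    ("198.51.100.0/24", 64496),  # no covering ROA
]:
    print(prefix, asn, index.validate(prefix, asn))
```

Note that a more-specific announcement from the correct origin is still invalid when it exceeds the ROA's maxLength, and that "not-found" is a distinct state from "invalid": most address space without ROAs falls into it.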

Filtering the Noise

Raw BGP stream data is noisy. A busy route collector sees millions of updates per hour. We need to filter down to what matters:

Your prefixes — announcements for prefixes you originate or your customers originate
Your upstream paths — routes that should only come from specific transit providers
RPKI-invalid origins — any announcement where the origin ASN does not match the ROA

The key insight is to maintain a baseline of expected announcements and alert on deviations. This is where Python shines — you can build a stateful monitor that tracks the current RIB and flags changes.
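
One possible shape for such a monitor is sketched below. `BaselineMonitor`, `collapse_prepends`, and the two baseline dictionaries are hypothetical names chosen for illustration, and the ASNs and prefix are documentation values. The monitor keeps the last path seen per (peer, prefix) and reports only when that path changes and deviates from the baseline.

```python
def collapse_prepends(as_path):
    """Remove consecutive duplicate ASNs introduced by AS-path prepending."""
    collapsed = []
    for asn in as_path:
        if not collapsed or collapsed[-1] != asn:
            collapsed.append(asn)
    return collapsed


class BaselineMonitor:
    """Track the current path per (peer, prefix) and flag deviations."""

    def __init__(self, expected_origins, expected_upstreams):
        self.expected_origins = expected_origins      # prefix -> {origin ASNs}
        self.expected_upstreams = expected_upstreams  # prefix -> {upstream ASNs}
        self.rib = {}                                 # (peer, prefix) -> path

    def announce(self, peer_asn, prefix, as_path):
        """Record an announcement and return a list of findings (may be empty)."""
        path = tuple(collapse_prepends(as_path))
        key = (peer_asn, prefix)
        changed = self.rib.get(key) != path
        self.rib[key] = path
        if not changed or not path:
            return []  # nothing new to report for this peer and prefix
        origin = path[-1]
        if origin not in self.expected_origins.get(prefix, set()):
            return [f"unexpected origin AS{origin}"]
        # The upstream is the AS immediately before the origin in the path.
        if len(path) >= 2 and path[-2] not in self.expected_upstreams.get(prefix, set()):
            return [f"unexpected upstream AS{path[-2]}"]
        return []

    def withdraw(self, peer_asn, prefix):
        self.rib.pop((peer_asn, prefix), None)


monitor = BaselineMonitor(
    expected_origins={"192.0.2.0/24": {64496}},
    expected_upstreams={"192.0.2.0/24": {64500}},
)
print(monitor.announce(64510, "192.0.2.0/24", [64510, 64500, 64496, 64496]))
print(monitor.announce(64510, "192.0.2.0/24", [64510, 64501, 64496]))
print(monitor.announce(64511, "192.0.2.0/24", [64511, 64499]))
```

Keying the state by peer as well as prefix means a leak seen through one collector peer is reported once, not once per update, while the same leak arriving through a different peer still produces a finding.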

Integrating with Alerting

Once you have a detection pipeline, the next step is alerting. We push alerts to Slack or PagerDuty, depending on severity:

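# Baseline of monitored prefixes and the ASNs allowed to originate them.
# Placeholder values (documentation prefix and ASN); replace with your own.
EXPECTED_ORIGINS = {
    "192.0.2.0/24": {64496},
}

def classify_severity(prefix, origin_asn, rpki_status):
    """Determine alert severity based on the leak characteristics."""
    prefix_len = int(prefix.split("/")[1])

    # The /16 threshold assumes IPv4; pick a different cutoff for IPv6.
    if rpki_status == "invalid" and prefix_len <= 16:
        return "critical"  # Large prefix, RPKI invalid
    elif rpki_status == "invalid":
        return "warning"   # Smaller prefix, still invalid
    elif origin_asn not in EXPECTED_ORIGINS.get(prefix, set()):
        return "info"      # Unexpected origin, but not RPKI invalid
    return None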
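
Routing the result to a channel could look roughly like the sketch below. The split used here (critical pages PagerDuty, everything else goes to Slack) is an assumption for illustration, as are the `build_alert` and `send_alert` helpers, the webhook URL, the routing key, and the `bgp-leak-monitor` source name. The payload shapes follow Slack incoming webhooks and the PagerDuty Events API v2.

```python
import json
import urllib.request

# Hypothetical placeholders: substitute your own webhook URL and routing key.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/REPLACE/ME"
PAGERDUTY_ROUTING_KEY = "REPLACE_ME"
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_alert(prefix, origin_asn, severity):
    """Return (url, payload) for the channel that matches the severity."""
    summary = f"[{severity}] {prefix} announced by AS{origin_asn}"
    if severity == "critical":
        payload = {
            "routing_key": PAGERDUTY_ROUTING_KEY,
            "event_action": "trigger",
            "payload": {
                "summary": summary,
                "source": "bgp-leak-monitor",
                "severity": "critical",
            },
        }
        return PAGERDUTY_EVENTS_URL, payload
    # Assumed policy: warning and info alerts only go to Slack.
    return SLACK_WEBHOOK_URL, {"text": summary}


def send_alert(url, payload):
    """POST the payload as JSON and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


url, payload = build_alert("192.0.2.0/24", 64496, "warning")
print(url, payload)
```

Keeping payload construction separate from delivery makes the routing policy easy to test without touching the network.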

Deployment Considerations

We run this as a systemd service on a small VM colocated with our route servers. A few things we learned:

Memory matters — pyBGPStream can consume significant memory if you are tracking a full table. Filter early and aggressively.
Rate limit alerts — BGP convergence events can trigger hundreds of updates in seconds. Debounce your alerting with a 30-second window.
Log everything — even if you do not alert on it. Historical BGP data is invaluable for post-incident analysis.
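
The 30-second window can be sketched as follows. Strictly speaking this is a leading-edge throttle and not a trailing debounce: the first alert for a key goes out immediately and repeats are suppressed until the window has passed. `Debouncer` and its parameters are hypothetical names, and the clock is injectable so the behavior can be tested without sleeping.

```python
import time


class Debouncer:
    """Suppress repeats of the same alert key within a fixed window."""

    def __init__(self, window_seconds=30.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self._last_sent = {}  # alert key -> time the last alert was sent

    def should_send(self, key):
        """Return True if an alert for this key should go out now."""
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window:
            return False  # still inside the suppression window
        self._last_sent[key] = now
        return True


debouncer = Debouncer(window_seconds=30.0)
key = ("192.0.2.0/24", 64511)  # one key per (prefix, origin) pair
print(debouncer.should_send(key))  # first alert for this key
print(debouncer.should_send(key))  # immediate repeat
```

A long-running service would also want to evict old keys from `_last_sent` periodically so the dictionary does not grow without bound.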

The total cost is one small VM and a few hours of Python. Compared to the potential impact of an undetected route leak, it is a worthwhile investment.

What This Does Not Catch

This approach has limitations. It relies on public route collector visibility, which means leaks that do not propagate to a collector go undetected. It also cannot detect leaks where the origin ASN is correct but the AS path is manipulated — that requires path validation techniques like ASPA or BGPsec.

For most operational teams, RPKI validation plus origin monitoring covers the high-impact scenarios. Start there and add complexity as needed.