How GitHub Leverages eBPF for Safer Deployments


GitHub faces a unique challenge: it hosts its own source code on github.com, creating a circular dependency. If the site goes down, teams cannot access the code needed to fix it. To address this, GitHub explored using eBPF to monitor and block problematic calls during deployments. Below, we answer key questions about this approach and the types of circular dependencies involved.

1. Why does GitHub face a circular dependency when deploying its own platform?

GitHub is its own biggest customer: it runs github.com on the same code it develops and hosts there. When a deployment is needed, for example to fix a MySQL outage, the deployment scripts often have to pull assets or binaries from github.com. If github.com is down because of that same outage, the scripts cannot complete. The result is a loop: to fix GitHub, you need GitHub. To mitigate this, GitHub maintains a code mirror and built assets for rollbacks. However, deeper circular dependencies can still occur within the deployment scripts themselves, which is why GitHub turned to eBPF for additional safety.

Source: github.blog

2. What are the different types of circular dependencies in deployment scripts?

GitHub identified three main types: direct, hidden, and transient circular dependencies. A direct dependency occurs when a deployment script explicitly tries to download a tool from GitHub during an outage. A hidden dependency occurs when a script uses a local tool that silently checks GitHub for updates and then fails or hangs when GitHub is unreachable. A transient dependency arises when a script calls another internal service, which in turn tries to fetch something from GitHub. Each type can block a deployment or complicate incident response, especially during outages that affect GitHub itself.

3. How does a direct circular dependency manifest during a MySQL outage scenario?

Imagine a MySQL outage prevents GitHub from serving release data from repositories. To resolve the incident, an operator needs to run a deploy script on each affected MySQL node. That script might attempt to pull the latest release of an open source tool directly from GitHub. Since GitHub is unable to serve the release data (due to the same outage), the script cannot complete. This is a clear, direct circular dependency: the fix requires a resource that is unavailable because of the problem being fixed. Such dependencies are easy to spot but can be missed if the script’s actions are not thoroughly reviewed.

4. What is a hidden circular dependency and how can it cause deployment failures?

A hidden circular dependency is one that isn’t obvious from the script’s primary logic. For example, a MySQL deploy script might use a servicing tool already on the machine’s disk. However, when that tool runs, it may check GitHub to see if an update is available. If GitHub is unreachable due to an outage, the tool might fail with an error, hang while waiting for a timeout, or refuse to proceed. This behavior is not documented in the script itself, making it hard to anticipate. Such dependencies mean that even local tools can inadvertently create circular dependencies on GitHub’s availability.
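To make the failure mode concrete, here is a minimal Python sketch of a local tool whose startup path includes an update check. Everything in it is hypothetical: `run_servicing_tool`, the two fetcher functions, and the version strings are illustrative names, not taken from any GitHub tooling. The unreachable host is simulated by a fetcher that raises `TimeoutError` rather than by a real network call.

```python
from typing import Callable


def run_servicing_tool(
    current_version: str,
    fetch_latest_version: Callable[[], str],
) -> str:
    """Hypothetical local tool that checks for updates before doing its job.

    The deploy script only calls this function and never mentions GitHub.
    The network dependency lives entirely inside the tool.
    """
    # Hidden dependency: this call reaches out to a remote host.
    # If that host is down, the exception escapes and the tool never
    # reaches the work the deploy script actually asked for.
    latest = fetch_latest_version()
    if latest != current_version:
        print(f"note: version {latest} is available")
    return "serviced"


def healthy_fetcher() -> str:
    """Stand-in for an update request when the remote host is reachable."""
    return "1.2.3"


def unreachable_fetcher() -> str:
    """Stand-in for the same request during an outage."""
    raise TimeoutError("update check timed out")
```

With `healthy_fetcher` the call returns normally. With `unreachable_fetcher` it raises before any servicing happens, even though the tool itself was already on disk.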


5. Can you give an example of a transient circular dependency?

Consider a MySQL deploy script that calls another internal service, such as a migrations service, through an API. That internal service, in turn, attempts to fetch the latest release of an open source tool from GitHub so that it can run the new binary. If GitHub is down, the migrations service fails, and that failure propagates back to the deploy script. This is a transient dependency because the script itself doesn't directly depend on GitHub; the dependency is introduced through an intermediate service. Such dependencies are particularly insidious because they involve multiple components and may surface only under specific failure conditions.
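One way to picture this chain is as a call graph in which the deploy script has no edge to GitHub of its own. The Python sketch below is an illustration under assumptions: the component names in `CALLS` are hypothetical, and a breadth-first search is simply one convenient way to surface the indirect path.

```python
from collections import deque
from typing import Dict, List, Optional

# Hypothetical call graph: each key calls the components in its list.
CALLS: Dict[str, List[str]] = {
    "mysql-deploy-script": ["migrations-service", "local-disk"],
    "migrations-service": ["github.com"],
    "local-disk": [],
    "github.com": [],
}


def path_to(
    graph: Dict[str, List[str]], start: str, target: str
) -> Optional[List[str]]:
    """Return the shortest call chain from start to target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == target:
            return path
        for callee in graph.get(node, []):
            if callee not in seen:
                seen.add(callee)
                queue.append(path + [callee])
    return None


chain = path_to(CALLS, "mysql-deploy-script", "github.com")
print(" -> ".join(chain) if chain else "no dependency found")
# prints: mysql-deploy-script -> migrations-service -> github.com
```

The script's own list of callees never names github.com; the dependency only appears once the migrations service's calls are followed as well.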

6. What was the traditional approach to handling circular dependencies at GitHub?

Until recently, the responsibility fell on each team that owns stateful hosts to manually review its deployment scripts and identify any circular dependencies. This involved reading through scripts, understanding all external calls, and ensuring that those calls did not rely on GitHub being available. In practice, this was error-prone because hidden and transient dependencies are easy to overlook. Teams might not know that a local tool checks for updates, or that an internal API calls GitHub. The process relied heavily on human vigilance and often missed edge cases, leading to deployment failures during incidents.
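The limits of script review can be shown with a small Python sketch that automates the most obvious check: searching a script's text for GitHub hostnames. The script contents, the `example-org/example-tool` path, the `servicing-tool` command, and the internal URL are all made up for illustration, and the regular expression is not a complete list of GitHub-owned hosts.

```python
import re
from typing import List, Tuple

# Matches github.com and its subdomains inside a line of text.
GITHUB_HOST = re.compile(r"\b(?:[\w-]+\.)*github\.com\b", re.IGNORECASE)


def find_github_references(script_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that mention a GitHub hostname."""
    hits = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        if GITHUB_HOST.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical deploy script with one direct, one hidden, and one
# transient dependency.
DEPLOY_SCRIPT = """\
#!/bin/sh
curl -fsSL https://github.com/example-org/example-tool/releases/latest -o /tmp/tool
servicing-tool --apply
curl -fsS https://migrations.internal.example/run
"""

for number, line in find_github_references(DEPLOY_SCRIPT):
    print(f"line {number}: {line}")
```

Only line 2, the direct download, is flagged. The `servicing-tool` invocation and the call to the internal migrations endpoint pass the check, although in this scenario each of them ends up contacting GitHub at run time.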

7. How does eBPF help GitHub improve deployment safety?

When designing a new host-based deployment system, GitHub evaluated eBPF as a way to selectively monitor and block calls that would create a circular dependency during a deployment. eBPF lets GitHub observe kernel-level activity, such as the system calls behind network requests to github.com, for deployment scripts and their child processes. If a script attempts a blocked call (e.g., downloading from GitHub), eBPF can intercept and deny it, so the circular dependency is caught at deploy time instead of surfacing during an outage. This enforcement happens at the kernel level, so even hidden or transient dependencies are caught without modifying the scripts themselves. By using eBPF, GitHub adds a safety net that automatically enforces deployment policies, reducing reliance on manual review and making deployments more resilient during outages.
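The decision such a policy has to make can be modeled in a few lines of Python. This is a userspace sketch of the allow/deny logic only, not an eBPF program, and it rests on assumptions: the article does not say how deployment processes are identified, so the parent-pid walk, the `DeployPolicy` class, the process table, and the hostnames are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class DeployPolicy:
    """Userspace model of a deny rule scoped to one deployment's process tree."""

    root_pid: int
    blocked_domains: Set[str] = field(default_factory=lambda: {"github.com"})

    def in_deploy_tree(self, pid: int, parent_of: Dict[int, int]) -> bool:
        """Follow parent links until the deploy root is found or the chain ends."""
        seen = set()
        while pid not in seen:
            if pid == self.root_pid:
                return True
            seen.add(pid)
            if pid not in parent_of:
                return False
            pid = parent_of[pid]
        return False  # a cycle in the parent map; treat as outside the tree

    def is_blocked_host(self, host: str) -> bool:
        """True for a blocked domain or any of its subdomains."""
        host = host.lower().rstrip(".")
        return any(
            host == domain or host.endswith("." + domain)
            for domain in self.blocked_domains
        )

    def allow(self, pid: int, host: str, parent_of: Dict[int, int]) -> bool:
        """Deny blocked hosts for deploy processes; leave everything else alone."""
        if not self.in_deploy_tree(pid, parent_of):
            return True
        return not self.is_blocked_host(host)


# Hypothetical process table: child pid -> parent pid.
PARENTS = {101: 100, 102: 101, 200: 1}
policy = DeployPolicy(root_pid=100)

print(policy.allow(102, "api.github.com", PARENTS))           # False
print(policy.allow(102, "mirror.internal.example", PARENTS))  # True
print(policy.allow(200, "github.com", PARENTS))               # True
```

Process 102 is a grandchild of the deploy script (pid 100), so its request to a GitHub host is denied, while process 200 belongs to no deployment and is unaffected. Scoping the rule to the process tree is what makes the blocking selective: the same host stays reachable for everything outside the deployment.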
