Dec 4, 19:27 UTC
Resolved – On December 4, 2024, between 18:52 UTC and 19:11 UTC, several GitHub services were degraded with an average error rate of 8%.
The incident was caused by a change to a centralized authorization service that contained an unoptimized database query. This increased overall load on a shared database cluster, which had a cascading effect on multiple services and specifically affected repository access authorization checks. We mitigated the incident by rolling back the change at 19:07 UTC, and services fully recovered within 4 minutes.
While this incident was caught and remedied quickly, we are implementing process improvements around recognizing and reducing the risk of changes involving high-volume authorization checks. We are also investing in broad improvements to our safe rollout process, such as improving early detection mechanisms.
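To illustrate what an early detection mechanism in a staged rollout might look like, the following is a minimal sketch and not a description of GitHub's actual tooling. It compares the error rate of traffic served by a new change against a baseline and signals a rollback when the gap is too large. All names (`RolloutWindow`, `should_roll_back`) and all thresholds are hypothetical assumptions.

```python
from dataclasses import dataclass


@dataclass
class RolloutWindow:
    """Request counts observed during one monitoring window (hypothetical shape)."""
    requests: int
    errors: int

    @property
    def error_rate(self) -> float:
        # Report 0.0 instead of dividing by zero when no traffic was observed.
        return self.errors / self.requests if self.requests else 0.0


def should_roll_back(
    baseline: RolloutWindow,
    canary: RolloutWindow,
    max_ratio: float = 2.0,   # assumed: tolerate up to 2x the baseline error rate
    min_requests: int = 100,  # assumed: skip windows with too little traffic
    floor: float = 0.001,     # assumed: minimum baseline rate used for comparison
) -> bool:
    """Return True when the canary error rate exceeds the allowed limit."""
    if canary.requests < min_requests:
        return False  # not enough data to judge this window
    allowed = max(baseline.error_rate, floor) * max_ratio
    return canary.error_rate > allowed


baseline = RolloutWindow(requests=10_000, errors=10)  # 0.1% errors
degraded = RolloutWindow(requests=1_000, errors=80)   # 8% errors
healthy = RolloutWindow(requests=1_000, errors=1)     # 0.1% errors
sparse = RolloutWindow(requests=50, errors=50)        # too little traffic
```

Under these assumed thresholds, a window with an error rate at the 8% level reported for this incident would trip the check against a 0.1% baseline, while a window matching the baseline would not.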
Dec 4, 19:26 UTC
Update – Pull Requests is operating normally.
Dec 4, 19:21 UTC
Update – Pull Requests is experiencing degraded performance. We are continuing to investigate.
Dec 4, 19:20 UTC
Update – Issues is operating normally.
Dec 4, 19:18 UTC
Update – API Requests is operating normally.
Dec 4, 19:17 UTC
Update – Webhooks is operating normally.
Dec 4, 19:11 UTC
Update – We have identified the change that caused timeouts impacting users across multiple services. This change was rolled back and we are seeing recovery. We will continue to monitor for complete recovery.
Dec 4, 19:07 UTC
Update – Issues is experiencing degraded performance. We are continuing to investigate.
Dec 4, 19:05 UTC
Update – API Requests is experiencing degraded performance. We are continuing to investigate.
Dec 4, 19:05 UTC
Update – Webhooks is experiencing degraded performance. We are continuing to investigate.
Dec 4, 18:58 UTC
Investigating – We are currently investigating this issue.