Incident Report
Summary
Starting on 4 December 2024 at 14:30 US/Pacific, Google BigQuery experienced elevated invalid value and internal system errors globally for 3 hours and 25 minutes on traffic involving the BigQuery and Google Drive integration. The incident affected users and automated tasks attempting to export data to Google Drive, resulting in failed export jobs.
Incident began at 2024-12-04 14:30 and ended at 2024-12-04 18:28 (all times are US/Pacific).
To our BigQuery customers whose business analytics were impacted during this disruption, we sincerely apologize. This is not the level of quality and reliability we strive to offer you, and we are taking immediate steps to improve the platform’s performance and availability.
Impacted users encountered “API key not valid” and “Failed to read the spreadsheet” errors for export jobs that accessed Google Drive. These errors caused service unavailability or failed jobs in Google BigQuery for the duration of the disruption.
- An internally used API key was flagged for non-compliance with a Google policy and deemed no longer in use, which led to its deletion.
- Unclear Internal Google Project Ownership: The project ownership was not clearly recorded, and thus incorrectly associated with a deprecated service.
- Outdated Information: The combination of the perceived lack of recent activity and the incorrect ownership led to the project being mistakenly classified as abandoned and deprecated.
Remediation and Prevention
Google engineers were alerted to the service degradation on 4 December 2024 at approximately 14:21 US/Pacific, when users began experiencing failures in data export operations. The alerts came through a support case, internal monitoring systems, and user reports. Upon investigation, the deletion of the project was identified as the root cause.
Root Cause
This disruption of data export functionality was triggered by the deletion of an internal project containing essential API keys. This deletion was an unintended consequence of several contributing factors:
Google is committed to continually improving our technology and operations to prevent service disruptions. We apologize for any inconvenience this incident may have caused and appreciate your understanding.
To mitigate the impact, the project was restored at approximately 15:45 US/Pacific. This action recovered the API keys and gradually restored the data export functionality for all affected users. The final error related to this incident was observed at approximately 17:55 US/Pacific, indicating full service recovery.
- Remove the dependency on API keys for BigQuery integrations with other Google services: This will eliminate the entire failure mode.
Google is committed to preventing a recurrence of this issue and is completing the following actions:
- Enhance Project Metadata: We are implementing a process for regular review and validation of project ownership and metadata. This will ensure that critical information about project usage and status is accurate and up-to-date, reducing the risk of incorrect assumptions about project status.
- Implement accidental deletion protection for critical internal resources: Use mechanisms such as project liens to ensure that a critical resource cannot be deleted accidentally.
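As a rough illustration of the lien idea, the sketch below models in plain Python how a lien that restricts `resourcemanager.projects.delete` blocks a deletion until the lien is removed. The `Project`, `Lien`, `delete_project`, and `try_delete` names are hypothetical and do not represent Google's internal tooling or any client library; only the restriction string mirrors the permission name used by Cloud Resource Manager liens.

```python
from dataclasses import dataclass, field

# Permission name that a lien restricts in order to block project deletion.
DELETE_RESTRICTION = "resourcemanager.projects.delete"


@dataclass(frozen=True)
class Lien:
    """Hypothetical stand-in for a lien placed on a project."""
    origin: str
    reason: str
    restrictions: tuple = (DELETE_RESTRICTION,)


@dataclass
class Project:
    """Hypothetical in-memory project record (not a real API object)."""
    project_id: str
    liens: list = field(default_factory=list)
    deleted: bool = False


class DeletionBlockedError(Exception):
    """Raised when at least one lien forbids deleting the project."""


def delete_project(project: Project) -> None:
    """Mark the project deleted unless a lien restricts deletion."""
    blocking = [l for l in project.liens if DELETE_RESTRICTION in l.restrictions]
    if blocking:
        reasons = "; ".join(l.reason for l in blocking)
        raise DeletionBlockedError(
            f"{project.project_id} is protected by {len(blocking)} lien(s): {reasons}"
        )
    project.deleted = True


def try_delete(project: Project) -> bool:
    """Return True if deletion succeeded, False if a lien blocked it."""
    try:
        delete_project(project)
    except DeletionBlockedError:
        return False
    return True
```

In this model, a cleanup job that calls `try_delete` on a project still carrying a lien gets `False` back and the project stays intact; the lien has to be removed deliberately before deletion can proceed.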
Detailed Description of Impact
Starting on 4 December 2024, Google BigQuery experienced elevated error rates for data export operations to Google Drive globally. Between approximately 14:21 and 18:04 US/Pacific, users attempting to export data from BigQuery to Google Drive encountered failures, resulting in service disruption for this specific functionality.
- Strengthen Internal Processes and Access Controls: We are strengthening our processes for deprecating and deleting projects, including mandatory reviews, impact assessments, and stakeholder approvals. This will prevent accidental deletion of critical projects and ensure that all potential impacts are thoroughly evaluated before any action is taken. We are also strengthening access controls for project deletion, ensuring that only authorized personnel with appropriate approvals can perform this action. This will add a further layer of protection against unintended project deletion.
Google BigQuery
This disruption specifically impacted users and automated tasks relying on the BigQuery to Google Drive export functionality. Export jobs initiated during this period failed to complete, preventing data transfer and potentially impacting downstream processes and workflows dependent on this data.
The incident affected all regions, and impacted users encountered errors such as “API key not valid,” “Failed to read the spreadsheet,” or “[Error: 80324028]”. Internal error messages further specified the issue as “Dremel returned third-party error from GDRIVE: FAILED_PRECONDITION: Encountered an error while creating temporary directory” with an underlying status of “Http(400) Bad Request, API key not valid. Please pass a valid API key.”
Affected locations: Johannesburg (africa-south1), Taiwan (asia-east1), Hong Kong (asia-east2), Tokyo (asia-northeast1), Osaka (asia-northeast2), Seoul (asia-northeast3), Mumbai (asia-south1), Delhi (asia-south2), Singapore (asia-southeast1), Jakarta (asia-southeast2), Sydney (australia-southeast1), Melbourne (australia-southeast2), Warsaw (europe-central2), Finland (europe-north1), Madrid (europe-southwest1), Berlin (europe-west10), Turin (europe-west12), London (europe-west2), Frankfurt (europe-west3), Netherlands (europe-west4), Zurich (europe-west6), Milan (europe-west8), Paris (europe-west9), Doha (me-central1), Dammam (me-central2), Tel Aviv (me-west1), Montréal (northamerica-northeast1), Toronto (northamerica-northeast2), São Paulo (southamerica-east1), Santiago (southamerica-west1), Iowa (us-central1), South Carolina (us-east1), Northern Virginia (us-east4), Columbus (us-east5), Dallas (us-south1), Oregon (us-west1), Los Angeles (us-west2), Salt Lake City (us-west3), Las Vegas (us-west4)
Affected products: Google BigQuery