December 17, 2024 | Down Log Status Tracker v0.23

Notion – Guests facing issues tagging members on pages

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 23:01 PST
Resolved – This incident has been resolved.

Dec 17, 01:25 PST
Investigating – We are currently investigating this issue.

Snowflake – Azure – East US 2 (Virginia): INC0122743

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 17:13 PST
Resolved – Current status: We’ve implemented the fix for this issue and monitored the environment to confirm that service was restored. If you experience additional issues or have questions, please open a support case via Snowflake Community.

Customer experience: Customers hosted in the specified regions may intermittently experience delays or failures while attempting to use various Snowflake services and features.

Incident start time: 14:00 UTC December 17, 2024
Incident end time: 23:42 UTC December 17, 2024

Preliminary root cause: The current root cause investigation is focused on a recent load balancer change that manages transaction IDs in order to minimize conflicts within the metadata database.

A root cause analysis (RCA) document will be published within seven business days.
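
The update stamps in this log are in PST while the incident window is reported in UTC, which makes the timeline easy to misread. The following is a minimal sketch for comparing the two, assuming "PST" means a fixed UTC-8 offset; the helper name `pst_to_utc` is hypothetical and not part of any Snowflake tooling.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PST" in the log is a fixed UTC-8 offset (no daylight saving in December).
PST = timezone(timedelta(hours=-8), name="PST")


def pst_to_utc(year, month, day, hour, minute):
    """Convert a PST wall-clock time, as used in the update stamps, to UTC."""
    local = datetime(year, month, day, hour, minute, tzinfo=PST)
    return local.astimezone(timezone.utc)


# Incident window exactly as reported in the update (already in UTC).
incident_start = datetime(2024, 12, 17, 14, 0, tzinfo=timezone.utc)
incident_end = datetime(2024, 12, 17, 23, 42, tzinfo=timezone.utc)
duration = incident_end - incident_start

# The "Resolved" update stamp (Dec 17, 17:13 PST) expressed in UTC.
resolved_post_utc = pst_to_utc(2024, 12, 17, 17, 13)

print(duration)                       # 9:42:00
print(resolved_post_utc.isoformat())  # 2024-12-18T01:13:00+00:00
```

Under that assumption, the reported incident window spans 9 hours 42 minutes, and the "Resolved" update was posted after the reported end time.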

Dec 17, 16:31 PST
Monitoring – Current status: We’ve implemented the fix for this issue, and we’ll continue to monitor the environment until we’re confident all services are functioning properly.

Customer experience: Customers hosted in the specified regions may intermittently experience delays or failures while attempting to use various Snowflake services and features.

Incident start time: 14:00 UTC December 17, 2024
Incident end time: 23:42 UTC December 17, 2024

Preliminary root cause: The current root cause investigation is focused on a recent load balancer change that manages transaction IDs in order to minimize conflicts within the metadata database.

Dec 17, 15:43 PST
Update – Current status: We’re continuing to investigate the issue with Snowflake Data Cloud. The current investigation is focused on a recent load balancer change that manages transaction IDs in order to minimize conflicts within the metadata database. We’ll provide another update within 60 minutes.

Customer experience: Customers hosted in the specified regions may intermittently experience delays or failures while attempting to use various Snowflake services and features.

Incident start time: 14:00 UTC December 17, 2024

Dec 17, 14:42 PST
Update – Current status: We’re continuing to investigate the issue with Snowflake Data Cloud. We’ll provide another update within 60 minutes.

Customer experience: Customers hosted in the specified regions may intermittently experience delays or failures while attempting to use various Snowflake services and features.

Incident start time: 14:00 UTC December 17, 2024

Dec 17, 13:50 PST
Update – Current status: We’re continuing to investigate the issue with Snowflake Data Cloud. We’ll provide another update within 60 minutes.

Customer experience: Customers hosted in the specified regions may intermittently experience delays or failures while attempting to use various Snowflake services and features.

Incident start time: 14:00 UTC December 17, 2024

Dec 17, 13:03 PST
Investigating – Current status: We’re investigating an issue with Snowflake Data Cloud. We’ll provide an update within 60 minutes.

Customer experience: Customers hosted in the specified regions may intermittently experience delays or failures while attempting to use various Snowflake services and features.

Incident start time: 14:00 UTC December 17, 2024

OpenAI – Degraded Performance in ChatGPT Advanced Voice Mode

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 15:23 PST
Resolved – Between 12:37 PM and 03:17 PM PST, some users experienced degraded performance on Advanced Voice Mode: affected conversations did not appear in the conversation history. We have now mitigated the issue and recovered.

Dec 17, 14:44 PST
Update – We’re continuing to investigate the issue. Users can still have conversations in Advanced Voice Mode, but they will not appear in the conversation history when returning. We’ll provide updates as soon as we have more information.

Dec 17, 13:30 PST
Update – We are still investigating this issue.

Dec 17, 12:37 PST
Investigating – We are currently experiencing degraded performance of ChatGPT Advanced Voice. Some of the conversation transcripts might not show up in the conversation history.

Firebase – UPDATE: Firebase Console incorrectly lists custom domain status as “pending” even after completion

December 17, 2024 by Down Log

Incident began at 2024-11-21 00:00 (all times are US/Pacific).

A fix for the issue is rolling out now, and should be live in all regions by the end of the week.

Affected products: App Hosting

Firebase – UPDATE: Firebase Console incorrectly lists custom domain status as “pending” even after completion

December 17, 2024 by Down Log

Incident began at 2024-11-21 00:00 (all times are US/Pacific).

The App Hosting console is reporting that custom domains are in a pending state even though setup has completed. This is caused by a communication failure inside our systems that we’re working to fix; the console will show the correct status once that fix is in place. The domains are working fine; this should have no impact on end users.

Affected products: App Hosting

Anthropic – Unauthorized post from @AnthropicAI X.com account

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 13:16 PST
Resolved – We have identified and addressed the root cause of unauthorized posts on @AnthropicAI, our official X account. No Anthropic systems or services were affected in this incident.

Dec 17, 09:08 PST
Monitoring – We have regained secure access to the @AnthropicAI account and will be continuing our investigation, with the support of X, into how these unauthorized posts were made.

Dec 17, 08:43 PST
Identified – We are aware of a second unauthorized post on @AnthropicAI on X.com, and are continuing to work to regain access to the impacted account. There are no impacts to other Anthropic services.

Dec 17, 08:04 PST
Investigating – We are aware of an unauthorized post originating from our official X.com account, @AnthropicAI. At this time the post has been removed, and we are investigating the issue.

Zapier – Stripe Auth Expiration and Reconnection Errors

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 11:38 PST
Resolved – This incident has been resolved.

On Dec 16th at 4:27 PM UTC, Stripe connections began failing, and attempts to reconnect returned the following error:

‘Invalid auth connection’

Our team implemented a fix at 5:30 PM UTC on the same day.

Due to these authentication errors, some Zaps using a Stripe trigger were turned off. As of 9:30 PM UTC on Dec 16th, our team unpaused these Zaps. During the downtime, we continued to receive hooks for paused Zaps, and any failed Zap runs were replayed by our team as of Dec 17th at 6:01 PM UTC. No data was lost.

We appreciate your patience during this incident and sincerely apologize for any inconvenience caused. If you have any further questions, please don’t hesitate to reach out to our support team here: https://zapier.com/app/get-help.
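
The update above says hooks kept arriving for paused Zaps and were replayed later without data loss. A minimal sketch of that buffer-and-replay pattern follows. It is not Zapier's implementation; the class `PausedWorkflowBuffer`, its methods, and the event IDs are all hypothetical names chosen for illustration.

```python
from collections import deque
from typing import Callable, Deque, Dict, List


class PausedWorkflowBuffer:
    """Hypothetical buffer that keeps incoming hooks while a workflow is paused."""

    def __init__(self, handler: Callable[[Dict], None]) -> None:
        self.handler = handler
        self.paused = False
        self.pending: Deque[Dict] = deque()

    def receive(self, hook: Dict) -> None:
        # While paused, store the hook instead of dropping it.
        if self.paused:
            self.pending.append(hook)
        else:
            self.handler(hook)

    def unpause_and_replay(self) -> int:
        # Replay buffered hooks in arrival order, then resume normal delivery.
        self.paused = False
        replayed = 0
        while self.pending:
            self.handler(self.pending.popleft())
            replayed += 1
        return replayed


processed: List[str] = []
workflow = PausedWorkflowBuffer(lambda hook: processed.append(hook["id"]))

workflow.receive({"id": "evt_1"})  # delivered immediately
workflow.paused = True             # e.g. an auth failure turned the workflow off
workflow.receive({"id": "evt_2"})  # buffered
workflow.receive({"id": "evt_3"})  # buffered
replayed = workflow.unpause_and_replay()

print(processed, replayed)  # ['evt_1', 'evt_2', 'evt_3'] 2
```

Because hooks are stored rather than rejected during the pause, the replay step can restore the full sequence in arrival order once the workflow is re-enabled.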

Dec 16, 10:48 PST
Monitoring – We are currently looking into an issue where users experienced errors when trying to establish a connection with Stripe. The specific error message was ‘Invalid auth connection.’

We are pleased to report that we have implemented a fix and users should now be able to reconnect/enable any Zaps using Stripe without encountering the earlier problem.

If any further issues arise or you have questions, please contact our Support Team via this link: https://zapier.com/app/get-help

We apologize for any inconvenience this may have caused and appreciate your understanding as we worked to resolve the issue. We will continue to monitor the situation closely.

Dec 16, 09:00 PST
Investigating – We’re currently investigating an issue where Stripe auths are expiring, and attempts to reconnect return the following error:

‘Invalid auth connection.’

We’ll update this page with more information as it becomes available. If you have any questions, please contact our support team at https://zapier.com/app/get-help.

Bubble – Issues with Main Bubble Cluster

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 13:09 EST
Resolved – Our systems are functional and we are closing out this incident.

Dec 17, 12:53 EST
Investigating – We are investigating reports of issues with our systems.

Cloudflare – LAX (Los Angeles) on 2024-12-17

December 17, 2024 by Down Log

Dec 13, 21:44 UTC
Update – We will be performing scheduled maintenance in the LAX (Los Angeles) datacenter on 2024-12-17 between 17:00 and 22:00 UTC.

Traffic might be re-routed from this location, so end users in the affected region may see a slight increase in latency during this maintenance window. For PNI / CNI customers connecting with us in this location, please make sure you are expecting this traffic to fail over elsewhere during this maintenance window, as network interfaces in this datacenter may become temporarily unavailable.

You can now subscribe to these notifications via Cloudflare dashboard and receive these updates directly via email, PagerDuty and webhooks (based on your plan): https://developers.cloudflare.com/notifications/notification-available/#cloudflare-status.

THIS IS A SCHEDULED EVENT Dec 17, 17:00 – 22:00 UTC

Dec 10, 21:48 UTC
Scheduled – We will be performing scheduled maintenance in the LAX (Los Angeles) datacenter on 2024-12-17 between 09:00 and 12:00 UTC.

GitHub – Live updates on pages not loading reliably

December 22, 2024 | December 17, 2024 by Down Log

Dec 17, 16:00 UTC
Resolved – On December 17th, 2024, between 14:33 UTC and 14:50 UTC, many users experienced intermittent errors and timeouts when accessing github.com. The error rate was 8.5% on average and peaked at 44.3% of requests. The increased error rate caused a broad impact across our services, such as the inability to log in, view a repository, open a pull request, and comment on issues. The errors were caused by our web servers being overloaded as a result of planned maintenance that unintentionally caused our live updates service to fail to start. As a result of the live updates service being down, clients reconnected aggressively and overloaded our servers.

We only marked Issues as affected during this incident despite the broad impact. This oversight was due to a gap in our alerting while our web servers were overloaded. The engineering team’s focus on restoring functionality led us to not identify the broad scope of the impact to customers until the incident had already been mitigated.

We mitigated the incident by rolling back the changes from the planned maintenance to the live updates service and scaling up the service to handle the influx of traffic from WebSocket clients.

We are working to reduce the impact of the live updates service’s availability on github.com to prevent issues like this one in the future. We are also working to improve our alerting to better detect the scope of impact from incidents like this.
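
The summary above attributes the overload to clients reconnecting aggressively after the live updates service went down. GitHub does not describe its client reconnect logic, so the following is only a sketch of a common client-side mitigation for reconnect storms: exponential backoff with full jitter. The function name `reconnect_delay` and the `base` and `cap` values are assumptions for illustration.

```python
import random
from typing import Optional


def reconnect_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                    rng: Optional[random.Random] = None) -> float:
    """Return a full-jitter delay in seconds for the given retry attempt.

    The ceiling doubles with each attempt (base * 2**attempt) up to `cap`.
    The delay is drawn uniformly from [0, ceiling] so that clients that
    disconnected together do not all reconnect at the same moment.
    """
    if rng is None:
        rng = random.Random()
    ceiling = min(cap, base * (2 ** attempt))
    return rng.uniform(0.0, ceiling)


# Illustrative schedule for the first eight attempts (parameters are assumptions).
rng = random.Random(0)
delays = [reconnect_delay(attempt, rng=rng) for attempt in range(8)]
for attempt, delay in enumerate(delays):
    print(f"attempt {attempt}: wait {delay:.2f}s before reconnecting")
```

The randomization spreads reconnect attempts over time, and the cap keeps the worst-case wait bounded, which limits the load a recovering service sees from many clients retrying at once.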

Dec 17, 15:32 UTC
Update – Issues is operating normally.

Dec 17, 15:29 UTC
Update – We have taken some mitigation steps and are continuing to investigate the issue. There was a period of wider impact on many GitHub services such as user logins and page loads which should now be mitigated.

Dec 17, 15:05 UTC
Update – Issues is experiencing degraded availability. We are continuing to investigate.

Dec 17, 14:53 UTC
Update – We are currently seeing live updates on some pages not working. This can impact features such as status checks and the merge button for PRs.

Current mitigation is to refresh pages manually to see latest details.

We are working to mitigate this and will continue to provide updates as the team makes progress.

Dec 17, 14:51 UTC
Investigating – We are investigating reports of degraded performance for Issues.