Intermittent failures on Git via SSH and HTTPS
Incident Report for Atlassian Bitbucket
Postmortem

SUMMARY

On June 2, 2023, between 00:07 UTC and 00:49 UTC, and again on June 6, 2023, between 06:25 UTC and 08:19 UTC, Atlassian customers using Bitbucket Cloud experienced significant degradation of the website and Git functionality. Both incidents were caused by an issue with our primary database, which impaired the operation of our core services until the database recovered. The incidents were detected immediately by our internal monitoring. The first incident was fully resolved within one hour and the second within two hours.

IMPACT

Customers who were impacted were unable to access the bitbucket.org website, APIs, Pipelines, and Git over HTTPS and SSH during the following times:

  • All Bitbucket services were impacted on June 2, 2023 between 00:07 UTC and 00:49 UTC. Website and API requests recovered after 7 minutes, while Git operations were not fully operational until 00:49 UTC.
  • All Bitbucket services were impacted on June 6, 2023 between 06:25 UTC and 08:19 UTC. Git operations recovered at 07:41 UTC. Website and API traffic recovered briefly from 07:16 UTC to 08:06 UTC but experienced additional instability from 08:07 UTC to 08:19 UTC.

ROOT CAUSE

Both incidents were triggered by database cluster restarts. The June 2, 2023 incident was caused by a bug in the version of the database engine we were running. The June 6, 2023 incident was caused by a maintenance patch being applied to all nodes in our database cluster.

REMEDIAL ACTIONS PLAN & NEXT STEPS

We know that outages impact your productivity. We are prioritizing the following improvement actions to avoid repeating these types of incidents in the future and reduce recovery time:

  • Enhance safeguards around patch application on the database
  • Improve how our services recover from database failures

We apologize to customers whose services were impacted during these incidents; we are taking immediate steps to improve the platform’s availability.

Thanks,

Atlassian Customer Support

Posted Jun 15, 2023 - 05:15 UTC

Resolved
Incident has been resolved.
Posted Jun 02, 2023 - 04:02 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jun 02, 2023 - 00:58 UTC
Update
We are continuing to investigate this issue.
Posted Jun 02, 2023 - 00:47 UTC
Investigating
We are aware of intermittent failures on Git via SSH and HTTPS.
Posted Jun 02, 2023 - 00:46 UTC
This incident affected: Git via SSH, Git via HTTPS, Pipelines, and Git LFS.