SSH and HTTPS Git commands impaired, clones and git pull failing
Incident Report for Atlassian Bitbucket
Postmortem

SUMMARY

On June 2, 2023, between 00:07 UTC and 00:49 UTC, and again on June 6, 2023, between 06:25 UTC and 08:19 UTC, Atlassian customers using Bitbucket Cloud experienced significant degradation of the Bitbucket website and Git functionality. Both incidents were caused by an issue with our primary database, which impaired our core services until the database recovered. Our internal monitoring detected both incidents immediately. The first incident was fully resolved within one hour and the second within two hours.

IMPACT

Impacted customers were unable to access the bitbucket.org website, APIs, Pipelines, and Git over HTTPS and SSH during the following times:

  • All Bitbucket services were impacted on June 2, 2023 between 00:07 UTC and 00:49 UTC. Website and API requests recovered after 7 minutes, while Git operations were not fully operational until 00:49 UTC.
  • All Bitbucket services were impacted on June 6, 2023 between 06:25 UTC and 08:19 UTC. Git operations recovered at 07:41 UTC. Website and API traffic recovered from 07:16 UTC to 08:06 UTC, then experienced additional instability from 08:07 UTC to 08:19 UTC.
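
For reference, the impact windows above can be converted into durations. The following is a minimal sketch using only the timestamps stated in this report; the labels and the helper name `minutes_between` are illustrative and not part of any Atlassian tooling.

```python
from datetime import datetime, timezone

# Impact windows (UTC) as stated in this report; labels are illustrative.
WINDOWS = {
    "June 2, all services": ("2023-06-02 00:07", "2023-06-02 00:49"),
    "June 6, all services": ("2023-06-06 06:25", "2023-06-06 08:19"),
    "June 6, Git operations": ("2023-06-06 06:25", "2023-06-06 07:41"),
}


def minutes_between(start: str, end: str) -> int:
    """Return the whole minutes between two 'YYYY-MM-DD HH:MM' UTC timestamps."""
    fmt = "%Y-%m-%d %H:%M"
    started = datetime.strptime(start, fmt).replace(tzinfo=timezone.utc)
    ended = datetime.strptime(end, fmt).replace(tzinfo=timezone.utc)
    return int((ended - started).total_seconds() // 60)


DURATIONS = {label: minutes_between(*window) for label, window in WINDOWS.items()}

for label, minutes in DURATIONS.items():
    print(f"{label}: {minutes} minutes")
```

By this calculation the June 2 window lasted 42 minutes and the June 6 window lasted 114 minutes, with Git operations on June 6 impaired for 76 minutes.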

ROOT CAUSE

Both incidents were triggered by database cluster restarts. The June 2, 2023 restart was caused by a bug in the version of the database engine we were running. The June 6, 2023 restart was caused by a maintenance patch being applied to all nodes in our database cluster.

REMEDIAL ACTIONS PLAN & NEXT STEPS

We know that outages impact your productivity. We are prioritizing the following improvement actions to avoid repeating these types of incidents in the future and reduce recovery time:

  • Enhance safeguards around patch application on the database
  • Improve how our services recover from database failures
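
As a general illustration of recovering from transient failures, and not a description of the changes Atlassian is making, one widely used pattern is to retry a failed operation with exponential backoff and jitter so that clients do not all reconnect at the same moment after a restart. The sketch below is hypothetical: the function name `retry_with_backoff`, its parameters, and the choice of `ConnectionError` as the retryable error are all assumptions made for the example.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    Only ConnectionError is treated as transient (an assumption for this
    sketch). The wait before each retry is a random value between 0 and an
    exponentially growing cap ("full jitter").
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

A caller would wrap the database-dependent call, for example `retry_with_backoff(lambda: fetch_repository_metadata())`, where `fetch_repository_metadata` is a placeholder for whatever operation may fail while the database restarts.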

We apologize to customers whose services were impacted during these incidents; we are taking immediate steps to improve the platform’s availability.

Thanks,

Atlassian Customer Support

Posted Jun 15, 2023 - 05:15 UTC

Resolved
All Bitbucket Cloud services have been restored. The team is continuing to investigate the cause of the error.
Posted Jun 06, 2023 - 08:00 UTC
Monitoring
An issue has been identified and a fix has been implemented. We're monitoring the recovery.
Posted Jun 06, 2023 - 07:38 UTC
Update
We are continuing to investigate this issue.
Posted Jun 06, 2023 - 07:05 UTC
Update
We are continuing to investigate this issue.
Posted Jun 06, 2023 - 07:04 UTC
Update
We are continuing to investigate this issue.
Posted Jun 06, 2023 - 06:56 UTC
Investigating
An increased error rate in Git clones has been observed. The team is currently investigating.
Posted Jun 06, 2023 - 06:48 UTC
This incident affected: Git via SSH and Git via HTTPS.