Root Cause Analysis - 15/01/2025
Written by Kiren Dosanjh-Dixon
Updated this week

The incident was caused by an inefficient database query included in a deployment intended to fix the display of all-day meetings. The query caused significant performance degradation shortly after release. The issue was identified and resolved in about an hour through a rollback and code reversion.

Impact:

  • A deployment that included an inefficient query for loading the schedule caused a database outage and significant platform performance degradation.

  • The degradation began shortly after the release and affected the performance of the platform as a whole.

  • The deployment was initiated at 09:50 AM, and the incident started at approximately 10:00 AM. The issue was resolved by 11:03 AM.

Resolution Steps:

  1. Incident Detection: The performance issues were noticed around 10:00 AM, shortly after the deployment.

  2. Rollback: The deployment was immediately rolled back to the previous stable version.

  3. Code Reversion: The code changes introducing the inefficient query were reverted to prevent further impact.

  4. Service Recovery: The rollback successfully restored the platform's performance by 11:03 AM.

Mitigations:

Query Performance Auditing:

  • Implement additional review processes for high-impact database queries, particularly those behind critical user-facing functionality.

  • Expand the Development and QA databases so that they more closely replicate the size of production data.
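
As an illustration of what an automated query audit could look like, the sketch below seeds a table with a production-like number of rows and flags any query whose plan reads the whole table without an index. This is not the check used on the platform: the `meetings` table, its columns, the `build_demo_db` and `full_scan_steps` helpers, and the choice of SQLite (used here only so the example is self-contained) are all assumptions.

```python
import sqlite3


def build_demo_db(row_count=10_000):
    """Create an in-memory database with a hypothetical meetings table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE meetings ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, day TEXT, all_day INTEGER)"
    )
    # Seed enough rows to resemble production volume rather than a tiny fixture.
    conn.executemany(
        "INSERT INTO meetings (user_id, day, all_day) VALUES (?, ?, ?)",
        ((i % 500, "2025-01-15", i % 2) for i in range(row_count)),
    )
    conn.commit()
    return conn


def full_scan_steps(conn, sql, params=()):
    """Return the plan steps that scan a whole table without using an index."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each plan row is a human-readable description.
    details = [row[3] for row in rows]
    return [d for d in details if d.startswith("SCAN") and "USING" not in d]


if __name__ == "__main__":
    conn = build_demo_db()
    # Hypothetical schedule query, not the query from the incident.
    query = "SELECT id FROM meetings WHERE user_id = ? AND day = ?"
    args = (1, "2025-01-15")

    print("before index:", full_scan_steps(conn, query, args))
    conn.execute("CREATE INDEX idx_meetings_user_day ON meetings(user_id, day)")
    print("after index:", full_scan_steps(conn, query, args))
```

A check along these lines could run in QA against the enlarged databases, failing the build when a query on a critical path reports a full table scan. On another database engine, the same idea would rely on that engine's own query plan output.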

Conclusion:

The database outage resulted from an inefficient query that was deployed to production without having been tested under sufficiently production-like conditions during the Development and QA processes. The issue was resolved quickly by rolling back to the previous stable version. Moving forward, enhancements to the QA process and deployment procedures will be prioritized to prevent recurrence.
