Downtime

Nango is experiencing problems.

Oct 06 at 07:22pm CEST
Affected services
Nango Cloud Health

Resolved
Oct 07 at 11:02am CEST

Post-Incident Summary — Database Lock Contention and Service Outage

Date: 6 October 2025
Impact: Temporary outage of Nango services
Status: Resolved

Summary
A sustained, high-volume workload from a downstream integration generated a large number of webhook events over an extended period. The follow-on fetches (getRecords) executed concurrently at scale, and a small update in that code path (persisting a “last fetched” timestamp) created heavy database contention. Lock saturation and connection exhaustion led to elevated latencies and 499 responses across public APIs until mitigations were applied.

Timeline (CEST)
- Issue began: 19:21
- Detected by monitoring: 19:37
- Mitigated: 20:35
- Fully resolved: 23:17

Root Cause
The getRecords flow includes a database write to persist a “last fetched” timestamp. Under extreme concurrency, these updates serialized and waited on one another, creating widespread lock contention. As locks piled up, available connections were exhausted, which introduced long response times and impacted all Nango services.
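
The effect described above can be illustrated with a toy model. This is a minimal sketch, not Nango's code: it assumes a burst of requests that all arrive at once, that every “last fetched” update targets the same row (so only one can hold the row lock at a time), and that a waiting request keeps its pooled connection while it waits. The function name, parameters, and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    last_finish_ms: int      # when the final request in the burst completes
    pool_saturated_ms: int   # how long every pooled connection is occupied


def model_burst(n_requests: int, pool_size: int, query_ms: int,
                writes_timestamp: bool) -> Outcome:
    """Toy model of n_requests arriving at t=0 (all values are assumptions)."""
    if not writes_timestamp:
        # Read-only path: no conflicting row locks, so requests run in
        # parallel waves bounded only by the connection pool size.
        waves = -(-n_requests // pool_size)       # ceiling division
        full_waves = n_requests // pool_size      # waves that fill the pool
        return Outcome(waves * query_ms, full_waves * query_ms)

    # Write path: every request updates the same row, so the updates run one
    # after another. Waiters hold a connection while blocked on the lock.
    last_finish = n_requests * query_ms
    if n_requests < pool_size:
        return Outcome(last_finish, 0)
    # The pool stays full until fewer than pool_size requests remain.
    return Outcome(last_finish, (n_requests - pool_size + 1) * query_ms)


if __name__ == "__main__":
    print(model_burst(1000, 50, 5, writes_timestamp=False))
    print(model_burst(1000, 50, 5, writes_timestamp=True))
```

Under these assumptions, the same burst keeps the pool saturated far longer once the write is added, which is the window in which unrelated queries cannot obtain a connection.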

Resolution
- Logs and metrics showed elevated lock waits and deadlock messages (over 18,000 waiting operations at peak).
- High-load webhook sources were temporarily disabled to shed database work, after which services recovered to normal.
- The “last fetched” update in getRecords was determined to be unnecessary and removed to eliminate this contention pattern.

Services were operational again by 20:35 CEST, and the incident was fully resolved at 23:17 CEST.

Follow-Up Actions
- Rate limiting on webhook endpoints.
- More controls to limit the blast radius of isolated single-tenant spikes.
- Enhanced observability for lock growth and connection-pool saturation to enable earlier, automated mitigation.
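
As an illustration of the first two actions, per-tenant rate limiting on a webhook endpoint could look roughly like the token bucket below. This is a hedged sketch under stated assumptions, not the planned implementation: the class name, the `tenant_id` key, and the rate and burst values are all hypothetical, and the limiter is in-memory and not thread-safe.

```python
import time
from collections import defaultdict


class TenantRateLimiter:
    """Token bucket per tenant (hypothetical sketch, single-process only)."""

    def __init__(self, rate_per_sec: float, burst: int, clock=time.monotonic):
        self.rate = rate_per_sec      # tokens refilled per second
        self.burst = burst            # maximum tokens a tenant can accumulate
        self.clock = clock            # injectable clock, useful for testing
        self._tokens = defaultdict(lambda: float(burst))
        self._last = {}

    def allow(self, tenant_id: str) -> bool:
        """Return True if this tenant's webhook should be processed now."""
        now = self.clock()
        last = self._last.get(tenant_id, now)
        self._last[tenant_id] = now
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self.burst,
                     self._tokens[tenant_id] + (now - last) * self.rate)
        if tokens >= 1:
            self._tokens[tenant_id] = tokens - 1
            return True
        self._tokens[tenant_id] = tokens
        return False


if __name__ == "__main__":
    limiter = TenantRateLimiter(rate_per_sec=100, burst=200)
    accepted = sum(limiter.allow("tenant-a") for _ in range(1000))
    print(f"accepted {accepted} of 1000 burst webhooks for tenant-a")
    print("tenant-b still allowed:", limiter.allow("tenant-b"))
```

Keeping a separate bucket per tenant means a spike from one tenant exhausts only that tenant's budget, so other tenants' webhooks continue to be accepted.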

Updated
Oct 06 at 09:13pm CEST

Nango services are up and running again. We will follow up with a postmortem here.

Updated
Oct 06 at 08:35pm CEST

It looks like we are back, but we are still monitoring the situation and will provide further updates.

Updated
Oct 06 at 08:19pm CEST

We have a lead on the root cause and are working on restoring services as soon as possible.

Created
Oct 06 at 07:22pm CEST

Nango is experiencing problems and we are investigating.