Nango is experiencing problems.
Resolved
Oct 07 at 11:02am CEST
Post-Incident Summary — Database Lock Contention and Service Outage
Date: 6 October 2025
Impact: Temporary outage of Nango services
Status: Resolved
Summary
A sustained, high-volume workload from a downstream integration generated a large number of webhook events over an extended period. The follow-on fetches (getRecords) executed concurrently at scale, and a small update in that code path (persisting a “last fetched” timestamp) created heavy database contention. Lock saturation and connection exhaustion led to elevated latencies and 499 responses across public APIs until mitigations were applied.
Timeline (CEST)
- Issue began: 19:21
- Detected by monitoring: 19:37
- Mitigated: 20:35
- Fully resolved: 23:17
Root Cause
The getRecords flow includes a database write to persist a “last fetched” timestamp. Under extreme concurrency, these updates serialized and waited on one another, creating widespread lock contention. As locks piled up, available connections were exhausted, which introduced long response times and impacted all Nango services.
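To illustrate why serialized updates degrade so sharply, here is a minimal back-of-the-envelope model. It is a sketch and not a measurement from the incident. The function name `queue_wait_ms`, the constant 5 ms lock hold time, and the assumption that every waiting update queues behind the same lock are all hypothetical.

```python
def queue_wait_ms(position: int, lock_hold_ms: float) -> float:
    """Estimate how long the update at 1-based `position` in the wait queue is blocked.

    Assumptions (hypothetical, for illustration only):
    - all updates contend for the same lock, so they run strictly one at a time;
    - one update already holds the lock when the queue forms;
    - every update holds the lock for the same `lock_hold_ms`.
    """
    if position < 1:
        raise ValueError("position must be >= 1")
    if lock_hold_ms < 0:
        raise ValueError("lock_hold_ms must be non-negative")
    # The current holder plus the (position - 1) waiters ahead must all finish first.
    return position * lock_hold_ms


# Hypothetical hold time; the real value was not published.
ASSUMED_LOCK_HOLD_MS = 5.0

# Wait for the last operation in a queue of 18,000, in seconds.
last_wait_s = queue_wait_ms(18_000, ASSUMED_LOCK_HOLD_MS) / 1000
```

Under these assumptions the last waiter in a queue of 18,000 would be blocked for about 90 seconds. Each blocked update also holds a database connection while it waits, which is consistent with the connection exhaustion described above.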
Resolution
- Logs and metrics showed elevated lock waits and deadlock messages (over 18,000 waiting operations at peak).
- High-load webhook sources were temporarily disabled to shed database work, after which services recovered to normal.
- The “last fetched” update in getRecords was determined to be unnecessary and removed to eliminate this contention pattern.
Services were operational again by 20:35 CEST, and the incident was fully resolved at 23:17 CEST.
Follow-Up Actions
- Rate limiting on webhook endpoints.
- Additional controls to limit the blast radius of single-tenant spikes.
- Enhanced observability for lock growth and connection-pool saturation to enable earlier, automated mitigation.
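As one possible shape for the first two follow-up items, the sketch below shows a per-tenant token-bucket rate limiter. It is an illustration under stated assumptions, not Nango's implementation. The names `TokenBucket`, `PerTenantRateLimiter`, and `tenant_id`, and the rate and capacity values, are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenBucket:
    """Token bucket: refills at `rate` tokens per second, up to `capacity`."""
    rate: float
    capacity: float
    tokens: float
    updated_at: float

    def try_consume(self, now: float, cost: float = 1.0) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PerTenantRateLimiter:
    """Keeps one bucket per tenant so one tenant's spike cannot spend another's budget.

    Not thread-safe and in-memory only; a real deployment would need locking
    or a shared store. Both are out of scope for this sketch.
    """

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        if rate <= 0 or capacity <= 0:
            raise ValueError("rate and capacity must be positive")
        self._rate = rate
        self._capacity = capacity
        self._clock = clock
        self._buckets: Dict[str, TokenBucket] = {}

    def allow(self, tenant_id: str) -> bool:
        """Return True if this tenant's webhook may be processed now."""
        now = self._clock()
        bucket = self._buckets.get(tenant_id)
        if bucket is None:
            # New tenants start with a full bucket.
            bucket = TokenBucket(self._rate, self._capacity, self._capacity, now)
            self._buckets[tenant_id] = bucket
        return bucket.try_consume(now)


# Hypothetical limits: bursts of 2 webhooks, refilled at 1 per second.
limiter = PerTenantRateLimiter(rate=1.0, capacity=2.0)
accepted = [limiter.allow("tenant-a") for _ in range(2)]
```

A webhook endpoint using a limiter like this would typically reject the request (for example with HTTP 429) when `allow` returns False, so excess events from one tenant are shed before they reach the database.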
Affected services
Updated
Oct 06 at 09:13pm CEST
Nango services are up and running again. We will follow up with a postmortem here.
Updated
Oct 06 at 08:35pm CEST
Services appear to be back. We are still monitoring the situation and will provide further updates.
Updated
Oct 06 at 08:19pm CEST
We have a lead on the root cause and are working on restoring services as soon as possible.
Created
Oct 06 at 07:22pm CEST
Nango is experiencing problems and we are investigating.