Oct 20: Major outage
Date of Incident: 20 October 2025
Resolution Time: 5 hours
Service Affected: The whole Zipchat platform
1. What happened
We experienced an unexpected, large-scale infrastructure outage caused by an upstream failure at Amazon Web Services (AWS), which also affected many other services across the internet. During that outage, our live chat widget could not process requests reliably. To avoid causing problems on your site (such as slow-loading or hanging pages), we proactively paused key parts of the widget. Your website's core functionality remained unaffected; the issue was isolated to the widget service.
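For readers curious how a widget can stop calling a failing backend instead of letting pages hang, here is a minimal circuit-breaker sketch in TypeScript. It is illustrative only and is not taken from the Zipchat codebase; the class name `CircuitBreaker` and the parameters `failureThreshold` and `cooldownMs` are hypothetical.

```typescript
// Hypothetical sketch of a circuit breaker; not Zipchat's actual implementation.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number; // consecutive failures before pausing
  private readonly cooldownMs: number; // how long to stay paused before a trial request

  constructor(failureThreshold: number, cooldownMs: number) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
  }

  // Returns true if a request may be attempted at time `now` (milliseconds).
  allowRequest(now: number): boolean {
    if (this.openedAt === null) return true; // breaker closed: traffic flows
    // Breaker open: only allow a trial request once the cooldown has elapsed.
    return now - this.openedAt >= this.cooldownMs;
  }

  // A successful call closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // A failed call counts toward the threshold; reaching it opens the breaker.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.failureThreshold) {
      this.openedAt = now;
    }
  }
}
```

In a sketch like this, the widget would check `allowRequest(Date.now())` before each backend call and skip the call when it returns false, so the host page never waits on a request that is likely to time out.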
2. Timeline
16:30 CET – We began noticing elevated error rates and timeouts in the widget service.
16:45 CET – We confirmed that the upstream AWS region (US-EAST-1) was experiencing broad instability.
21:00 CET – AWS reported the issue broadly resolved.
21:30 CET – We verified our service end to end and restored full functionality of the widget.
3. Impact
The widget was unavailable or degraded during the incident window shown in the timeline above.
Some customers may have seen increased page load times, script errors, or failed chat sessions during the outage.
No data loss or breach occurred. Your website and core services remained live and unaffected.
4. Root cause
The incident was caused by a major outage in AWS's infrastructure (US-EAST-1 region), which impacted networking and internal services at scale. Because our widget service runs in that region, we were caught in the disruption. (AWS is publishing a post-event summary.)
Because the failure was external to our stack, we had limited control until AWS resolved it.
5. Corrective and preventive actions
During the outage:
We paused the widget to avoid impacting customer sites.
We monitored the AWS Health Dashboard and other external status pages for real-time updates.
We validated our deployment once the upstream issue was cleared and gradually restored service.
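One common way to pause a widget and then restore it gradually is a percentage-based rollout flag. The TypeScript sketch below is illustrative only and makes no claim about how Zipchat actually did it; `bucketFor`, `isWidgetEnabled`, `siteId`, and `rolloutPercent` are hypothetical names.

```typescript
// Hypothetical sketch of a percentage rollout; not Zipchat's actual implementation.

// Maps a site id to a stable bucket in [0, 100) using an FNV-1a-style 32-bit hash
// over the string's UTF-16 code units.
function bucketFor(siteId: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < siteId.length; i++) {
    hash ^= siteId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return hash % 100;
}

// rolloutPercent = 0 pauses the widget everywhere; 100 enables it everywhere.
// Values in between enable it for a stable subset of sites.
function isWidgetEnabled(siteId: string, rolloutPercent: number): boolean {
  if (rolloutPercent <= 0) return false;
  if (rolloutPercent >= 100) return true;
  return bucketFor(siteId) < rolloutPercent;
}
```

Because each site always lands in the same bucket, raising `rolloutPercent` step by step only ever adds sites; a site that already has the widget back does not lose it again partway through the restore.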
6. What you should do
You don’t need to take any action: the widget is now back online and functioning normally.
If you encounter any abnormal behaviour (e.g., slow loads or script errors), we recommend checking your site configuration and contacting us if the problem persists.
7. Note for users who removed the widget script during the outage
If you removed or commented out the widget script for safety during the incident, you may now safely re-insert the script and resume use of the live chat widget.
We apologise again for the disruption and appreciate your patience. We take your service continuity very seriously, and we are committed to delivering higher resilience going forward.
For broader context on the underlying infrastructure outage, you can read this update.