Major capacity loss on the web servers
Incident Report for WeTransfer
Postmortem

The incident started around 10:30 UTC. We routinely update our dependencies to keep our users and our servers secure. A dependency on encryption libraries was silently introduced into one of the modules used by Ansible, the tool we use for provisioning our servers. This is an indirect dependency that we do not control. With it in place, essential provisioning tasks failed, which made it impossible for us to start new machines. We have found the cause, pinned our Ansible installation to specific versions, and opened an issue with the Ansible project on GitHub so that the problem can be resolved upstream. We also pinned our provisioning systems to versions that are known to deploy successfully, and added the encryption libraries to our deployment manifests.
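
As an illustration of the pinning described above, here is a minimal sketch of a pre-flight check that compares installed package versions against a list of pins before provisioning begins. It is not our actual tooling: the function names, the package name `example-crypto-lib`, and the version strings are hypothetical placeholders, and it assumes the provisioning tools are installed as Python packages.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_pin_violations(pins, lookup=installed_version):
    """Compare pinned versions against what is installed.

    Returns a list of (package, pinned, found) tuples, one per mismatch.
    A missing package is reported with found=None.
    """
    violations = []
    for package, pinned in sorted(pins.items()):
        found = lookup(package)
        if found != pinned:
            violations.append((package, pinned, found))
    return violations


# Hypothetical pins: the names and versions below are placeholders,
# not the ones used in the real deployment manifests.
EXAMPLE_PINS = {
    "ansible": "0.0.0",
    "example-crypto-lib": "0.0.0",
}

if __name__ == "__main__":
    for package, pinned, found in find_pin_violations(EXAMPLE_PINS):
        print(f"{package}: pinned {pinned}, found {found}")
```

A check along these lines would fail before any machine is started, so a newly introduced indirect dependency would surface as a missing or mismatched package rather than as a failed provisioning task.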

Posted Apr 29, 2016 - 12:32 UTC

Resolved
The issue has been resolved. We are going to post a post-mortem explanation.
Posted Apr 29, 2016 - 12:25 UTC
Monitoring
Everything is operational. We're monitoring as the servers are coming up. Post-mortem will follow.
Posted Apr 29, 2016 - 11:30 UTC
Identified
We are currently having issues with web server capacity. The cause has been identified and we are working on resolving the issue.
Posted Apr 29, 2016 - 10:08 UTC