Loss of capacity on the API servers
Incident Report for WeTransfer
Resolved
This last attempt finally succeeded. Everything completed in time and the service did not suffer in the meantime. Our main database is now redundant again. We have just run and re-enabled all the scheduled tasks (for cleaning up expired transfers, sending out mails, etc.) that didn't run during the last few hours. From now on, it's back to normal. Thank you all for your patience and understanding. Now, happy Monday morning transferring!
Posted about 1 year ago. Jul 11, 2016 - 05:19 UTC
Update
Yes, we are still awake and so are our friends at Amazon AWS! After another attempt, which went wrong but was stopped quickly enough not to affect you too much, we are now making a final attempt to get our secondary database instance in sync with the primary. Things are looking bright, but we're looking for the optimal point where the syncing does not break our service and still completes before traffic starts to increase again in the morning.
Posted about 1 year ago. Jul 11, 2016 - 00:45 UTC
Update
Unfortunately, the impact was too big and we had to cancel the operation. This attempt caused 10 minutes of downtime.
Posted about 1 year ago. Jul 10, 2016 - 20:00 UTC
Update
The situation is currently stable but not ideal. Our database server is usually redundant, but only one instance is active right now - so we will need to make another move. The second instance should be turned on as quickly as possible so that we are running redundantly again. All signs for this are green, but (and we hate bringing this to you) doing it might cause another performance hit for a short while. We'll be on top of the situation and will cancel the operation if the impact is too big.
Posted about 1 year ago. Jul 10, 2016 - 19:38 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted about 1 year ago. Jul 10, 2016 - 12:01 UTC
Identified
Once we came out of maintenance, the load on our database spiked to very high levels. We have to wait for the load to stabilise before we re-enable everything.
Posted about 1 year ago. Jul 10, 2016 - 10:28 UTC
Investigating
After the database update this morning, our servers are having a little bit of trouble coming up all at once. Please bear with us as we give them a helping hand.
Posted about 1 year ago. Jul 10, 2016 - 09:35 UTC