2017/02/15 Dynalist outage

Dynalist was unavailable from 6:00 am to 7:30 am EDT on 2017/02/15. It came back up briefly a few times during that window but was mostly unavailable.


Original post:

It looked like a simple restart was all we needed to get it up and running again, but soon afterwards it goes back to a 500 Server Error.

We’re looking into it – if you’re seeing the error, please do not keep refreshing the page as that would only make it more difficult to restart the server.

Thanks & sorry for the inconvenience!

It should be back up now.

My current understanding of the problem is that our database's memory limit was configured too strictly. Once the data grew past the point where all indices fit into memory, the core sync queries started doing full-table scans. This caused some requests to take more than 60 seconds; those requests were terminated and later restarted, resulting in a continuous cycle that kept our servers down.
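To illustrate why the restarts did not help on their own, here is a toy model of that cycle. It is only a sketch, not our server code: the 60-second timeout comes from the description above, while the function name, the attempt cap, and the example query durations are made up for illustration.

```python
def run_with_retries(query_seconds, timeout_seconds=60.0, max_attempts=5):
    """Toy model: a request is killed at the timeout and then retried.

    Returns (attempts, wasted_seconds, succeeded). The retry runs the
    same query again, so a query that is too slow once is too slow
    every time, and each attempt burns a full timeout of server time.
    `max_attempts` is a hypothetical cap added so the loop terminates.
    """
    wasted = 0.0
    for attempt in range(1, max_attempts + 1):
        if query_seconds <= timeout_seconds:
            return attempt, wasted, True
        # Terminated at the timeout; all the work done so far is thrown away.
        wasted += timeout_seconds
    return max_attempts, wasted, False


# Hypothetical durations: an indexed lookup vs. a full-table scan.
fast = run_with_retries(0.05)
slow = run_with_retries(90.0)
```

In this model the fast query finishes on the first attempt with nothing wasted, while the slow one never finishes and only ties up the server, which is also why refreshing the page during the outage made things worse.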

Our current fix trimmed the indices on the table responsible for the issue, and we’ll be raising the memory limit to allow for future growth.
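As a rough sketch of the kind of check that would catch this earlier, the snippet below compares total index size against a memory limit with some headroom. The function name, the 20% headroom, and all the sizes are hypothetical; they are not our real numbers or our database's actual API.

```python
def indices_fit(index_bytes, memory_limit_bytes, headroom=0.2):
    """Return True if all indices fit in memory with some room to grow.

    `index_bytes` maps an index name to its size in bytes. `headroom`
    is the fraction of the limit kept free for future growth; 0.2 is
    an arbitrary illustrative value.
    """
    usable = memory_limit_bytes * (1.0 - headroom)
    return sum(index_bytes.values()) <= usable


# Hypothetical sizes, in arbitrary units.
before = {"index_a": 600, "index_b": 300}
trimmed = {"index_a": 600}

fits_before = indices_fit(before, 1000)          # over budget
fits_after_trim = indices_fit(trimmed, 1000)     # trimming indices
fits_after_raise = indices_fit(before, 1200)     # raising the limit
```

Either trimming the indices or raising the limit gets back under budget in this example; doing both leaves more room before the same problem comes back.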

I want to add that our immediate plan for the next few weeks was already to work on optimizing server resource usage, now that Google Calendar sync and mobile apps are mostly in place.

The timing of this incident is unfortunate, but we learned our lesson from it, and it will motivate us to better monitor and optimize our services in the future.

Apologies!

Are you sure you were not running any ‘Backend Job process’ on the 15th? The screenshot looks that way.

Not as far as I know. The issues causing both outages are unrelated, so I would say it’s a coincidence that it happened on the 15th…