Date | October 4th, 2019 |
Description | Customers in some regions unable to access Silo |
Duration | 5 hours |
Affected components | New Silo sessions were returning the error “Server is unavailable. / Unable to start session” |
Affected customers | All customers |
Root cause analysis | Authentic8 Application Servers use a package named ClamAV to scan files that users download to their local machines from Silo Secure Storage or the Internet. An unattended update automatically upgraded the ClamAV package, which in turn removed a part of ClamAV that Authentic8 Application Servers require to run. Unattended updates are used across all Authentic8 environments so that security patches are applied as soon as possible. The net effect of the ClamAV security patch, however, was that any Authentic8 Application Server that received the update stopped loading new sessions. Authentic8’s monitoring systems did not flag the patch as faulty because no error was apparent until users attempted to connect. |
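The monitoring gap described in the root cause analysis (a dependency that fails only when a user starts a session) is the kind of failure an active probe can surface earlier. The following is a minimal sketch, not Authentic8's actual monitoring: `probe_command` and `ProbeResult` are hypothetical names, and the use of `clamscan --version` as the probe is an assumption, since the report does not say which part of ClamAV was removed.

```python
import shutil
import subprocess
from typing import NamedTuple, Sequence


class ProbeResult(NamedTuple):
    ok: bool
    detail: str


def probe_command(cmd: Sequence[str], timeout: float = 10.0) -> ProbeResult:
    """Report whether a dependency's command is present and exits cleanly."""
    # A package upgrade that removes a binary shows up here as "not found".
    executable = shutil.which(cmd[0])
    if executable is None:
        return ProbeResult(False, f"{cmd[0]} not found on PATH")
    try:
        completed = subprocess.run(
            [executable, *cmd[1:]],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return ProbeResult(False, f"{cmd[0]} timed out after {timeout}s")
    except OSError as exc:
        return ProbeResult(False, f"{cmd[0]} could not be started: {exc}")
    if completed.returncode != 0:
        return ProbeResult(False, f"{cmd[0]} exited with code {completed.returncode}")
    return ProbeResult(True, f"{cmd[0]} responded")


if __name__ == "__main__":
    # Assumption: the scanner is reachable as `clamscan`; adjust to the real dependency.
    print(probe_command(["clamscan", "--version"]))
```

Run on a schedule, a probe like this would report a failure as soon as the dependency stops responding, rather than waiting for a user to attempt a connection.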
Event Log
On 10/04/2019 at 09:28 GMT, Authentic8 began receiving reports from various customers who were unable to connect to the service, and internal support staff began to investigate. All system monitors indicated normal operation.
On 10/04/2019 at 12:30 GMT, the support staff and operations teams began to triage the incident together. Both parties were able to reproduce the issue, confirming that there was indeed an isolated anomaly.
On 10/04/2019 at 13:21 GMT, the Authentic8 Operations Team identified the issue and began to dig deeper. A few moments later, it was confirmed that the unattended update was the cause of the problem, and work towards a resolution began.
Resolution
On 10/04/2019 at 14:48 GMT, the Authentic8 Operations Team gave the all clear after deploying a patch to all affected Authentic8 Application Servers.
Authentic8 is confident about the sequence of events that led to the issue. The unattended update caused an error in the root process running on Authentic8’s Application Servers, because the upgraded package was no longer compatible with that process. This incompatibility ultimately caused the issue by denying users access to new sessions.
Moving Forward
Authentic8 has some takeaways from this event. Although the event was small in scale (only 13% of Authentic8 Application Servers were affected), Authentic8 will apply more rigorous controls when unattended updates affect processes running on its application servers.
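One common way to make automatic updates more rigorous is to stage them: update a small canary group first and continue only if those servers still pass a health check. The sketch below illustrates that general idea and is not a description of Authentic8's process. The function `staged_rollout` and its parameters `apply_update`, `is_healthy`, and `canary_fraction` are all hypothetical.

```python
import math
from typing import Callable, List, Sequence


def staged_rollout(
    servers: Sequence[str],
    apply_update: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    canary_fraction: float = 0.1,
) -> List[str]:
    """Update a canary subset first; continue only if every canary stays healthy.

    Returns the servers that were updated. Raises RuntimeError if a canary fails.
    """
    if not servers:
        return []
    # Always test on at least one server, even for very small fleets.
    canary_count = max(1, math.ceil(len(servers) * canary_fraction))
    canaries, rest = servers[:canary_count], servers[canary_count:]

    updated: List[str] = []
    for server in canaries:
        apply_update(server)
        updated.append(server)

    failed = [s for s in canaries if not is_healthy(s)]
    if failed:
        # Stop here so the remaining servers keep serving on the old version.
        raise RuntimeError(f"canary check failed on: {', '.join(failed)}")

    for server in rest:
        apply_update(server)
        updated.append(server)
    return updated
```

In this sketch, `is_healthy` would need to exercise the same path a user does, such as starting a session, since the failure in this incident was invisible to passive monitors.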