Window Time: 45 minutes between 11:00am and 12:00pm
A CPU has failed in one of the hardware nodes, so we need to shut the server down and replace the failed CPU.
We have already evacuated all VPSs from the hardware node in question, so they will be unaffected.
The hardware node does, however, contain 6 TB of SSD storage for the cluster. The cluster keeps 3 copies of all data, so when we shut down the node, the VPS servers will continue to function normally using the other two copies.
The cluster will, however, start re-replicating the data held on the stopped node in order to maintain 3 copies, and will spread that data across the other nodes.
This replication will put a high load on the storage cluster, causing higher-than-normal data access latency. This may in turn cause higher-than-normal load on the VPS servers until the replication process is completed.
Therefore, during this maintenance window, the VPS servers should be considered at risk of higher-than-normal load and increased response times.