Performance Optimization - 9.5 [Cluster] Redundancy-pattern fault tolerance


While performing the cluster operation, the fault tolerance must be considered. For the single machine operation, if the machine fails, the operation fails. For the cluster operation, however, the failure of only a few nodes may not affect the cluster to continue working.

To be fault tolerant, redundancy must be used. If, as the situations discussed in the previous sections, each node only holds the data of one zone, then the failure of any node will cause the data to be incomplete, thereby making it impossible to perform proper calculation.

The esProc server can store the data of multiple zones at the same time. When a node fails, if all zones can be found on the remaining nodes, then the calculation can still proceed. However, it will increase the amount of tasks executed by some nodes, and the overall computing performance may also decrease.

Still, we take four zones as an example, and let each node hold two zones as follows:

101: zone 1, 2; 102: zone 2, 3; 103: zone 3, 4; 104: zone 4, 1

Let’s examine the previous code:

1 [“”,“”,…, “”]
2 =file(“orders.ctx”:[1,2,3,4],A1)

If all nodes run normally, the file() function in A2 will take zone 1 on 101, take zone 2 on 102,…, and take zone 4 on 104 to constitute a cluster file. If 103 fails, resulting in the failure in searching zone 3 on 103, it will continue to search on 104. If zone 3 is still not found, it will turn back to 101 to search, and will eventually find and take zone 3 on 102, and then take zone 4 on 104. As a result, the four zones of this cluster file will be taken from 101, 102, 102, and 104 respectively, and 102 will execute the operation of two zones.

Such scheme using the data by multiple times to implement fault tolerance is called the redundancy-pattern fault tolerance.

There is a problem here. According to our previous assumption that when each node holds only one zone, each zone of the multi-zone composite table of cluster can only correspond to unique node. But if there are redundant zones on the node, this correspondence is not unique. For example, we can take out four zones from 101, 102, 103 and 104 respectively, and we can also take the same zones from 104, 101, 102 and 103. For the single-table operation, how zones are distributed will not affect the calculation results, but for the multi-table association operation (like the homo-dimension tables association), a same zone distribution method is required to continue the operation, because zone 1 on 101 and zone 1 on 104 cannot be associated.

1 [“”,“”,…, “”]
2 =file(“A.ctx”:[1,2,3,4],A1)
3 =file(“B.ctx”,A2)

The file() function can generate a new cluster file according to the known zone distribution scheme of a certain cluster file, and A3 will adopt the same zone distribution as A2. In this way, the association calculation can be performed based on the constituted cluster table.

Let’s continue to examine the distribution scheme just mentioned:

101: zone 1, 2; 102: zone 2, 3; 103: zone 3, 4; 104: zone 4, 1

If both 103 and 104 fail, not all zones can be found on 101 and 102, which makes it impossible to continue to calculate. In other words, such redundant distribution scheme only tolerates the failure of one node. It is found after a closer analysis that this scheme can continue to work when any node fails, but can’t work once any two nodes fail.

We call the fault tolerance of this distribution scheme as 1, while for the scheme with only one zone for one node in the previous example, its fault tolerance is 0. For the cluster with n nodes, if each node has all data, its fault tolerance is n-1. When the fault tolerance is k (the operation can still be performed after any k nodes fail; the cluster fails when k+1 nodes fail), the data is required to be duplicated by k times (that is, in addition to one original data, k more copies should be stored). The fault tolerance is sometimes called redundancy.

For the cluster with n nodes, we can simply use the loop distribution scheme to achieve k fault tolerance:

Node 1: zone 1, zone 2,…, zone k+1
Node 2: zone 2, zone 3,…, zone k+2
Node i: zone i, zone i+1,…, zone i+k
Node n: zone n, zone 1,…, zone k

where, zone m is equivalent to zone m-n when m>n.