ES選舉-類Bully算法


Bully算法

bully算法是一個分佈式系統中動態選擇master節點的算法,進程號最大的非失效的節點將被選為master。
算法用三種消息類型:

1)選舉消息 (Election Message: Sent to announce election.)
2)應答消息(Answer (Alive) Message: Responds to the Election message.)
3)選舉成功消息 (Coordinator (Victory) Message: Sent by winner of the election to announce victory.)

當一個進程P從失敗中恢復,或者接收到主節點失效信息,進程P將做以下事情:

1)如果進程P有最大的進程ID,那麼它則會向其他節點廣播Coordinator (Victory) Message。否則進程P向進程號比它大的進程發送Election Message
2)如果進程P發送Election Message後,沒有接收到應答,它就會向其他節點廣播Coordinator (Victory) Message,併成為master。
3)如果進程P接收到比它進程號更高的進程的Answer (Alive) Message信息,那麼它這次的選舉就失敗了,等待接收其他節點的Coordinator (Victory) Message。
4)如果進程P接收到比它進程號低的進程的Election message,那麼它會向該節點返回一個Answer (Alive) Message,並啟動選舉進程,並向比它更高的進程發送Election Message。
5)如果進程P接收到Coordinator (Victory) Message,那麼它就會把發送這條消息的節點看作為master進程。

ES Master選舉過程

我看的源碼是5.6版本的。因此以下的解釋都是依據5.6的源碼來說的。
當master節點失效之後,各個節點find master的過程如下:
1)每個節點會ping配置文件中discovery.zen.ping.unicast.hosts的IP,找到所有存活的節點並過濾
2)找到非本身的active master

 List<discoverynode> activeMasters = new ArrayList<>();
for (ZenPing.PingResponse pingResponse : pingResponses) {
// We can't include the local node in pingMasters list, otherwise we may up electing ourselves without
// any check / verifications from other nodes in ZenDiscover#innerJoinCluster()
if (pingResponse.master() != null && !localNode.equals(pingResponse.master())) {
activeMasters.add(pingResponse.master());
}
}
/<discoverynode>

3)找到所有的可成為master的節點集合masterCandidates ,包含自己

 // nodes discovered during pinging
List<electmasterservice.mastercandidate> masterCandidates = new ArrayList<>();
for (ZenPing.PingResponse pingResponse : pingResponses) {
if (pingResponse.node().isMasterNode()) {
masterCandidates.add(new ElectMasterService.MasterCandidate(pingResponse.node(), pingResponse.getClusterStateVersion()));
}
}
/<electmasterservice.mastercandidate>

4)如果activeMasters 為空,也就是說不存在活著的master節點,同時當前活著的節點滿足配置中discovery.zen.minimum_master_nodes的數量,那麼就從masterCandidates 挑出ID最小的節點,讓其成為master節點。如果activeMasters 不為空,則從中選擇最小的ID成為Master節點即可。

if (activeMasters.isEmpty()) {
if (electMaster.hasEnoughCandidates(masterCandidates)) {
final ElectMasterService.MasterCandidate winner = electMaster.electMaster(masterCandidates);
logger.trace("candidate {} won election", winner);
return winner.getNode();

} else {
// if we don't have enough master nodes, we bail, because there are not enough master to elect from
logger.warn("not enough master nodes discovered during pinging (found [{}], but needed [{}]), pinging again",
masterCandidates, electMaster.minimumMasterNodes());
return null;
}
} else {
assert !activeMasters.contains(localNode) : "local node should never be elected as master when other nodes indicate an active master";
// lets tie break between discovered nodes
return electMaster.tieBreakActiveMasters(activeMasters);
}

electMaster.electMaster方法和electMaster.tieBreakActiveMasters方法則都是從集合中選取最小節點的ID:

 public MasterCandidate electMaster(Collection<mastercandidate> candidates) {
assert hasEnoughCandidates(candidates);
List<mastercandidate> sortedCandidates = new ArrayList<>(candidates);
sortedCandidates.sort(MasterCandidate::compare);
return sortedCandidates.get(0);
}
public DiscoveryNode tieBreakActiveMasters(Collection<discoverynode> activeMasters) {
return activeMasters.stream().min(ElectMasterService::compareNodes).get();
}
/<discoverynode>/<mastercandidate>/<mastercandidate>

如果當前不存在active master,那麼activeMasters 一定為空,則從masterCandidates 從選出id最小的節點即可。
如果當前存在active master,且當前節點不是active maste,那麼從activeMasters 中選出id最小的節點。
如果當前存在active master,且當前節點是active maste,那麼activeMasters 為空,從masterCandidates 中選出id最小的節點即自己。


ES選舉-類Bully算法


分享到:


相關文章: