RATIS-2428. Allow installation of Snapshot from followers #1370
Open
devabhishekpal wants to merge 5 commits into apache:master
…t of source peers plus the minimum acceptable snapshot index.
…rn best candidate
Author
Hi @szetszwo, could you take a look at the proposal and code changes? I wasn't sure which config category to put this under.
What changes were proposed in this pull request?
Problem
In the current implementation, only the leader provides the snapshot. This causes tasks to be paused until the snapshot installation completes, and it puts unnecessary pressure on the leader.
Goal
Allow a lagging follower to install a snapshot from another follower without making that source follower act as the leader. This lets the lagging follower either stay in sync or catch up to the point where it can append new entries without a complete snapshot.
High Level Flow Diagram
How do we select the follower source?
To select the follower source, we consider the following metrics and conditions.
Inputs
- The target follower (`T`) which needs a snapshot
- `lastEntry`, `logStartIndex` and `firstAvailableTermIndex`
- `matchIndex`, `commitIndex`, `snapshotIndex`, `lastRespondedAppendEntriesSendTime` and `lastRpcResponseTime`

Eligibility
A follower (`F`) is considered a source only if:
- `F` is recently responsive on the append path. For this we can use `lastRespondedAppendEntriesSendTime` as the check and fall back to `lastRpcResponseTime`.
- `F.matchIndex >= requiredSnapshotIndex`, where `requiredSnapshotIndex = firstAvailableTermIndex.index - 1`.

This is because:
- `requiredSnapshotIndex` is the minimum snapshot index that still lets the target resume normal AppendEntries from the leader after install.
- If `F.matchIndex < requiredSnapshotIndex`, that follower is too far behind to bridge the leader's log gap for this target, so the leader should not choose it.

Ranking
Rank eligible followers by this lexicographic order:
- `matchIndex`
- `commitIndex`, in case the match index is tied
- `lastRespondedAppendEntriesSendTime`

IMPORTANT: If no follower satisfies `matchIndex >= requiredSnapshotIndex`, do not attempt a follower-sourced install. Fall back immediately to the existing leader path, because otherwise the target follower would need to perform another snapshot install to catch up anyway.

What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/RATIS-2428
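The eligibility and ranking rules from "How do we select the follower source?" can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the code in this patch: the names `FollowerSourceSelector`, `Candidate`, `select` and `maxSilenceMs` are hypothetical, a timestamp of 0 is assumed to mean "not recorded", and higher values are assumed to rank first for every key.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of follower-source selection; not the classes used in the patch. */
final class FollowerSourceSelector {

  /** Hypothetical view of the per-follower state tracked by the leader. */
  static final class Candidate {
    final String id;
    final long matchIndex;
    final long commitIndex;
    final long lastRespondedAppendEntriesSendTimeMs;
    final long lastRpcResponseTimeMs;

    Candidate(String id, long matchIndex, long commitIndex,
        long lastRespondedAppendEntriesSendTimeMs, long lastRpcResponseTimeMs) {
      this.id = id;
      this.matchIndex = matchIndex;
      this.commitIndex = commitIndex;
      this.lastRespondedAppendEntriesSendTimeMs = lastRespondedAppendEntriesSendTimeMs;
      this.lastRpcResponseTimeMs = lastRpcResponseTimeMs;
    }

    /** Assumption: 0 means "not recorded", so fall back to lastRpcResponseTime. */
    long lastSeenMs() {
      return lastRespondedAppendEntriesSendTimeMs > 0
          ? lastRespondedAppendEntriesSendTimeMs
          : lastRpcResponseTimeMs;
    }
  }

  /** requiredSnapshotIndex = firstAvailableTermIndex.index - 1 */
  static long requiredSnapshotIndex(long firstAvailableIndex) {
    return firstAvailableIndex - 1;
  }

  /**
   * Returns the best follower source for the target, or empty when no follower
   * is eligible. An empty result means: use the existing leader path.
   */
  static Optional<Candidate> select(List<Candidate> followers, String targetId,
      long firstAvailableIndex, long nowMs, long maxSilenceMs) {
    final long required = requiredSnapshotIndex(firstAvailableIndex);
    return followers.stream()
        // The target cannot be its own source.
        .filter(f -> !f.id.equals(targetId))
        // Eligibility 1: recently responsive on the append path.
        .filter(f -> nowMs - f.lastSeenMs() <= maxSilenceMs)
        // Eligibility 2: far enough ahead to bridge the leader's log gap.
        .filter(f -> f.matchIndex >= required)
        // Ranking: matchIndex, then commitIndex, then last responded time
        // (assumption: larger is better for each key).
        .max(Comparator.comparingLong((Candidate f) -> f.matchIndex)
            .thenComparingLong(f -> f.commitIndex)
            .thenComparingLong(f -> f.lastRespondedAppendEntriesSendTimeMs));
  }
}
```

A caller would treat an empty `Optional` as the signal to fall back to the leader-sourced install, which matches the IMPORTANT note above. Whether the real change breaks ties in the same direction is not stated in the description.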
How was this patch tested?
Patch was tested using the unit tests.