TM cannot bypass the KDC login, so the token issued by the JM is never actually used #27795
Open
liangrui198 wants to merge 1 commit into apache:master
Issue: https://issues.apache.org/jira/browse/FLINK-39274
Contribution Checklist
Currently, when a cluster runs a large number of Flink batch jobs or short-lived small jobs, the KDC comes under excessive load and becomes sluggish.
The cause is that the Flink TM does not reuse the token distributed by the JM; it logs in again from the keytab. The check in the code is crude: it only looks at whether the principal is null to decide whether to perform the keytab login. Because the JM and TM share the same keytab configuration, the TM always takes the login path.
If a Flink batch job only accesses HDFS and requires Kerberos authentication, the TM does not need to log in to the KDC or renew the token at all. It can use the token obtained by the JM directly.
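To make the load problem concrete, the check described above can be modelled in a few lines of plain Java. This is only a sketch of the behaviour as this description states it, not Flink's actual `HadoopModule` code: the class name, the `Role` enum, the method names, and the principal `flink/host@EXAMPLE.COM` are all hypothetical and chosen for illustration.

```java
import java.util.List;

/** Hypothetical model of the principal-only check; not Flink's actual code. */
final class CurrentLoginDecisionSketch {

    /** Process roles in a Flink cluster. */
    enum Role { JOB_MANAGER, TASK_MANAGER }

    /**
     * Models the check as described: the keytab login runs whenever a
     * principal is configured. The role and the presence of a JM-issued
     * token are deliberately ignored, which is the problem being reported.
     */
    static boolean performsKeytabLogin(Role role, String principal, boolean hasJmToken) {
        return principal != null && !principal.isBlank();
    }

    /** Counts how many processes would contact the KDC under this model. */
    static long countKdcLogins(List<Role> processes, String principal) {
        return processes.stream()
                // Assumption: every TM already holds a token from the JM.
                .filter(role -> performsKeytabLogin(role, principal, role == Role.TASK_MANAGER))
                .count();
    }

    public static void main(String[] args) {
        List<Role> job = List.of(
                Role.JOB_MANAGER, Role.TASK_MANAGER, Role.TASK_MANAGER, Role.TASK_MANAGER);
        // Shared configuration: every process sees the same principal.
        System.out.println("KDC logins with shared principal: "
                + countKdcLogins(job, "flink/host@EXAMPLE.COM"));
        System.out.println("KDC logins without principal: "
                + countKdcLogins(job, null));
    }
}
```

Under this model a job with one JM and three TMs triggers four KDC logins, although only the JM's login is needed when the TMs can use the distributed token. With many short-lived jobs the extra logins add up quickly.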
What is the purpose of the change
Add a check that prevents the TM from logging in to the KDC again. Since the JM has already issued a valid token (I have observed this directly), the TM can rely on it, which gives the same optimization that the Spark mechanism provides.
Verifying this change
HadoopModuleTest.java
add keytabLoginDisabledShouldSkipKdcLogin test
add keytabLoginEnabledByDefaultShouldPerformKdcLogin test
Does this pull request potentially affect one of the following parts:
security.kerberos.login.keytab-login.enabled defaults to true, so the existing logic remains unchanged. Users who need to skip the keytab login can set it to false.
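The intended effect of the option can be sketched as follows. Only the option key `security.kerberos.login.keytab-login.enabled` and its default of true come from this PR; the class, the method names, the use of a plain `Map` in place of Flink's configuration object, and the example principal are assumptions made for illustration. How the setting is scoped to the TM is not modelled here.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of gating the keytab login behind the new option. */
final class KeytabLoginFlagSketch {

    /** Option key as named in this PR. */
    static final String KEYTAB_LOGIN_ENABLED = "security.kerberos.login.keytab-login.enabled";

    /** Reads the option; an absent key falls back to true (existing behaviour). */
    static boolean isKeytabLoginEnabled(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(KEYTAB_LOGIN_ENABLED, "true"));
    }

    /** The keytab login runs only if a principal is set and the option allows it. */
    static boolean shouldLoginFromKeytab(Map<String, String> conf, String principal) {
        boolean principalConfigured = principal != null && !principal.isBlank();
        return principalConfigured && isKeytabLoginEnabled(conf);
    }

    public static void main(String[] args) {
        String principal = "flink/host@EXAMPLE.COM"; // hypothetical principal
        Map<String, String> conf = new HashMap<>();

        // Option not set: behaves like the existing logic.
        System.out.println("default: " + shouldLoginFromKeytab(conf, principal));

        // Option disabled: skip the KDC login and rely on the JM-issued token.
        conf.put(KEYTAB_LOGIN_ENABLED, "false");
        System.out.println("disabled: " + shouldLoginFromKeytab(conf, principal));
    }
}
```

This mirrors the two cases that the added tests are named after: login performed by default, login skipped when the option is disabled.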
Documentation
flink\docs\content.zh\docs\deployment\security\security-kerberos.md
flink\docs\content\docs\deployment\security\security-kerberos.md
add security.kerberos.login.keytab-login.enabled