[FLINK-27773][Web Dashboard] Top N Metrics Dashboard#27774
Open
featzhang wants to merge 5 commits into apache:master from
This commit introduces a Top N Metrics Dashboard to the Flink Web UI, providing visibility into resource-intensive components:
- Top N CPU Consumers: Identify tasks with the highest CPU usage
- Top N Backpressure Operators: Highlight operators experiencing backpressure
- Top N GC Intensive Tasks: Show tasks with the highest GC overhead
The implementation includes:
- REST API endpoint: /jobs/:jobid/metrics/top-n
- Response body with three metric categories
- Angular components for displaying metrics
- Demo page showcasing the feature
This feature helps operators quickly identify performance bottlenecks and optimize job execution.
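To illustrate what a "Top N" selection over per-task metric values might look like, here is a minimal sketch using a bounded min-heap. It is not taken from this PR: the class name `TopNSketch`, the method `topN`, and the `Map<String, Double>` input shape are all assumptions made for illustration, and the real handler may select its entries differently.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper, not part of the PR: picks the n largest metric values.
final class TopNSketch {

    /** Returns the n entries with the largest values, ordered from highest to lowest. */
    static List<Map.Entry<String, Double>> topN(Map<String, Double> valuesByTask, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        Comparator<Map.Entry<String, Double>> byValue = Map.Entry.comparingByValue();
        // Min-heap that holds at most n entries; the smallest retained value is at the head.
        PriorityQueue<Map.Entry<String, Double>> heap = new PriorityQueue<>(byValue);
        for (Map.Entry<String, Double> e : valuesByTask.entrySet()) {
            // Copy the entry so the result does not depend on the input map afterwards.
            heap.offer(new AbstractMap.SimpleImmutableEntry<>(e));
            if (heap.size() > n) {
                heap.poll(); // evict the current minimum
            }
        }
        List<Map.Entry<String, Double>> result = new ArrayList<>(heap);
        result.sort(byValue.reversed());
        return result;
    }
}
```

The heap never grows beyond n + 1 entries, so the selection costs O(m log n) for m tasks rather than the O(m log m) of a full sort. The order of entries with equal values is unspecified in this sketch.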
… architecture
This commit fixes fundamental architectural issues in the Top N Metrics Dashboard implementation:
1. Fixed REST Handler inheritance - Now properly extends AbstractRestHandler instead of using incorrect base class
2. Fixed MessageHeaders implementation - Now implements RuntimeMessageHeaders with correct method signatures
3. Fixed MetricStore access - Using public APIs (getRepresentativeAttempts, getAllSubtaskMetricStores) instead of attempting to access private members
4. Fixed HTTP method references - Using HttpMethodWrapper instead of non-existent HttpMethod
5. Added proper logging - Added Logger instance for better error tracking
6. Added handler registration - Registered TopNMetricsHandler in WebMonitorEndpoint
7. Moved response body to correct package - Moved from legacy.messages to proper job.metrics package
The implementation now follows Flink's standard REST API architecture pattern and properly interacts with the MetricStore system through public APIs.
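The task-then-subtask traversal described in item 3 can be sketched without any Flink classes. The example below uses plain nested maps as stand-ins for the metric stores (task ID to subtask index to metric name to string value). The class `SubtaskMetricFlattener`, the `taskId#subtaskIndex` key format, and the assumption that metric values arrive as strings are all hypothetical and not taken from the PR.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for walking task and subtask metric stores; no Flink types are used.
final class SubtaskMetricFlattener {

    /**
     * Collects one numeric value per subtask for the given metric name.
     * Subtasks that lack the metric, or whose value is not a number, are skipped.
     */
    static Map<String, Double> flatten(
            Map<String, Map<Integer, Map<String, String>>> subtaskMetricsByTask,
            String metricName) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<Integer, Map<String, String>>> task :
                subtaskMetricsByTask.entrySet()) {
            for (Map.Entry<Integer, Map<String, String>> subtask : task.getValue().entrySet()) {
                String raw = subtask.getValue().get(metricName);
                if (raw == null) {
                    continue; // metric not reported for this subtask
                }
                try {
                    double value = Double.parseDouble(raw);
                    if (!Double.isNaN(value)) {
                        // Assumed key format: "<taskId>#<subtaskIndex>"
                        result.put(task.getKey() + "#" + subtask.getKey(), value);
                    }
                } catch (NumberFormatException ignored) {
                    // unparseable value: skip rather than fail the whole request
                }
            }
        }
        return result;
    }
}
```

Skipping missing or malformed values, rather than throwing, keeps one misbehaving subtask from failing the whole ranking. Whether the actual handler takes that approach is not stated in the PR.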
Purpose
This PR fixes fundamental architectural issues in the Top N Metrics Dashboard implementation that were identified during CI analysis. The previous implementation had critical design flaws that prevented it from working correctly.
Changes
- Now properly extends AbstractRestHandler instead of using incorrect base class
- Now implements RuntimeMessageHeaders with correct method signatures (getRequestClass, getResponseClass, getResponseStatusCode, getHttpMethod)
- Uses metricStore.getRepresentativeAttempts() to get job tasks
- Uses taskMetricStore.getAllSubtaskMetricStores() to get subtasks
- Uses HttpMethodWrapper instead of non-existent HttpMethod
- Added Logger instance for better error tracking
- Registered TopNMetricsHandler in WebMonitorEndpoint
- Moved response body from legacy.messages to proper job.metrics package
Implementation Details
The implementation now follows Flink's standard REST API architecture pattern:
- AbstractRestHandler<RestfulGateway, EmptyRequestBody, TopNMetricsResponseBody, TopNMetricsMessageParameters>
- MetricFetcher integration
Verifying this change
Documentation
This adds a new REST endpoint:
GET /jobs/:jobid/metrics/top-n that returns Top N metrics for a job.
Notes
The previous PR #27771 was closed due to fundamental architectural issues. This implementation addresses all identified issues and follows Flink's standard patterns.
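For the endpoint listed under Documentation, a client-side request could be built roughly as follows with the JDK's `java.net.http` API (Java 11+). This is only a sketch: the class `TopNEndpointSketch` and its methods are hypothetical, the base URL is whatever address the Flink REST server is reachable at, and the PR text does not specify any query parameters or the response schema, so none are assumed here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side helper; only the path /jobs/:jobid/metrics/top-n comes from the PR.
final class TopNEndpointSketch {

    /** Builds the endpoint URI for a job, tolerating a trailing slash on the base URL. */
    static URI topNUri(String baseUrl, String jobId) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return URI.create(base + "/jobs/" + jobId + "/metrics/top-n");
    }

    /** Builds a GET request for the endpoint; sending it is left to the caller. */
    static HttpRequest topNRequest(String baseUrl, String jobId) {
        return HttpRequest.newBuilder(topNUri(baseUrl, jobId))
                .header("Accept", "application/json")
                .GET()
                .build();
    }
}
```

The request can then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, with the body parsed by whichever JSON library the caller already uses.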