Skip to content

[FLINK-38704][metrics] Fix MetricConfig.getString() to handle numeric values#27789

Open
mukul-8 wants to merge 1 commit intoapache:release-2.1from
mukul-8:backport-FLINK-38704-release-2.1
Open

[FLINK-38704][metrics] Fix MetricConfig.getString() to handle numeric values#27789
mukul-8 wants to merge 1 commit intoapache:release-2.1from
mukul-8:backport-FLINK-38704-release-2.1

Conversation

@mukul-8
Copy link

@mukul-8 mukul-8 commented Mar 20, 2026

Backport of #27743

What is the purpose of the change

Fixes the Prometheus metrics reporter (and potentially other reporters) not loading configured port values.

The issue occurs because numeric YAML configuration values are stored as Integer/Long objects, but Properties.getProperty() only returns Strings. When MetricConfig.getString() is called for numeric values like metrics.reporter.prom.port, it returns null, causing reporters to fall back to default values.

Note: The bug only reproduces when a single number is assigned to the port (e.g., 9999). It works correctly when a port-range is used (e.g., 9000-9100) because ranges are stored as Strings.

Workaround: Quote the port value in YAML configuration: metrics.reporter.prom.port: "9999"

Brief change log

  • Changed MetricConfig.getString() to use get() instead of getProperty(), matching the pattern of getInteger(), getLong(), etc. Added test cases to verify the fix.

Verifying this change

This change fixes the bug where:

  • Configuration loads: metrics.reporter.prom.port=9999
  • But reporter starts on default port 9249 instead

After the fix, the configured port is correctly applied.

Added test case to verify this change is working fine.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)

@flinkbot
Copy link
Collaborator

flinkbot commented Mar 20, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@mukul-8
Copy link
Author

mukul-8 commented Mar 22, 2026

@flinkbot run azure

1 similar comment
@mukul-8
Copy link
Author

mukul-8 commented Mar 23, 2026

@flinkbot run azure

@mukul-8
Copy link
Author

mukul-8 commented Mar 23, 2026

@1996fanrui @Izeren can you please review, CI failure is not related to this change

Copy link
Contributor

@Izeren Izeren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thank you for the fix @mukul-8

I see that tests for table planner have failed, but this is unrelated to PR

@github-actions github-actions bot added the community-reviewed PR has been reviewed by the community. label Mar 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-reviewed PR has been reviewed by the community. target:release-2.1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants