@Kritavya

This PR adds a lightweight /v1/metrics endpoint that exposes basic server
status information in JSON format.

The endpoint reports:

  • server status
  • uptime since startup
  • static system configuration info (captured once at startup)

The existing /metrics endpoint remains unchanged and continues to provide
Prometheus-format performance metrics. The new endpoint is intended for
human-readable status checks and operational debugging, similar in spirit to
/v1/health, but with uptime information included.

Changes are limited to tools/server/ and do not affect inference or request
handling paths.
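
For illustration only, the payload described above could be assembled roughly as in the following sketch. The thread does not show the actual response schema, so the field names (`status`, `uptime_seconds`, `system`) and the `StatusReporter` class are hypothetical, and the real change lives in the C++ server code under tools/server/ rather than in Python.

```python
import json
import time


class StatusReporter:
    """Hypothetical helper: builds a small status payload with uptime."""

    def __init__(self, static_info, clock=time.monotonic):
        # A monotonic clock avoids jumps when the wall clock is adjusted.
        self._clock = clock
        self._started = clock()
        # Static system info is copied once at startup and never refreshed.
        self._static_info = dict(static_info)

    def snapshot(self):
        # Field names are assumptions, not the schema used by the PR.
        return {
            "status": "ok",
            "uptime_seconds": int(self._clock() - self._started),
            "system": dict(self._static_info),
        }

    def to_json(self):
        return json.dumps(self.snapshot())
```

A handler for the endpoint would then only need to return `reporter.to_json()`, which keeps the response small compared with a full configuration dump.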

@ngxson
Collaborator

ngxson commented Dec 14, 2025

any reason why this info cannot be exposed in /props instead?

@Kritavya
Author

I did consider exposing this via /props, but opted for a separate /v1/metrics
endpoint mainly because /props currently returns a fairly large configuration
payload (generation parameters, chat templates, model metadata, etc., including
large string fields).

Uptime is a runtime value that may be polled frequently. Putting it under
/props would force clients to repeatedly fetch and parse heavy static
configuration data, and would prevent them from caching what is otherwise a
configuration-discovery endpoint.

Keeping runtime status lightweight and separate felt cleaner, but I’m happy to
adapt if maintainers prefer another structure.

@ngxson
Collaborator

ngxson commented Dec 14, 2025

first off, the /v1 prefix is reserved for the OpenAI-compatible API, and /v1/metrics is definitely not an API schema that exists anywhere else on the internet.

secondly, the goal of an API (application programming interface) is not to be human-readable; otherwise it would be called a Human Interface.

there are already /metrics, /props, and /slots for monitoring server status. I don't think it's necessary to add another endpoint, which would confuse both users and maintainers.

> Uptime is a runtime value that may be polled frequently

Grafana and the /metrics endpoint exist for that reason
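
As a rough sketch of what polling the existing /metrics endpoint involves on the client side, the snippet below parses unlabeled samples from the Prometheus text exposition format. The metric names in `SAMPLE` are made up for illustration and are not taken from the server's real output; fetching the text over HTTP is left out.

```python
def parse_prometheus_text(text):
    """Parse unlabeled samples from Prometheus text exposition format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        # Skip malformed lines and labeled samples such as name{a="b"}.
        if len(parts) < 2 or "{" in parts[0]:
            continue
        try:
            samples[parts[0]] = float(parts[1])
        except ValueError:
            continue
    return samples


# Hypothetical sample text; real metric names will differ.
SAMPLE = """\
# HELP example:uptime_seconds Hypothetical uptime gauge.
# TYPE example:uptime_seconds gauge
example:uptime_seconds 42
example:requests_processing 1
"""

metrics = parse_prometheus_text(SAMPLE)
```

In practice a Prometheus server scrapes the endpoint and Grafana reads from it, so a hand-written parser like this is only useful for quick ad hoc checks.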

@Kritavya
Author

Thanks for the detailed feedback — that makes sense.

I agree that /v1/* should remain reserved for OpenAI-compatible APIs, and that
adding a non-OpenAI endpoint under that namespace is inconsistent with the
existing API structure.

My intent was mainly to expose a lightweight runtime signal (uptime / online
status), but I understand that /metrics, /props, and /slots already cover
server observability concerns and that introducing another endpoint may add
confusion.

I’m happy to adapt this in whatever direction maintainers prefer — whether that
means moving the information under an existing endpoint, changing the path, or
dropping the endpoint entirely if it’s not aligned with the project’s direction.
