Skip to content

omit DescribeVersion API calls for drained versions#229

Open
Shivs11 wants to merge 3 commits intomainfrom
shivam/reduce-describe-version-calls
Open

omit DescribeVersion API calls for drained versions#229
Shivs11 wants to merge 3 commits intomainfrom
shivam/reduce-describe-version-calls

Conversation

@Shivs11
Copy link
Member

@Shivs11 Shivs11 commented Mar 18, 2026

Summary

  • Switch GetWorkerDeploymentState from the SDK's Describe() to the gRPC DescribeWorkerDeployment API, which returns full DrainageInfo (including DrainedSince timestamps) in the version summaries
  • Eliminate per-version DescribeVersion calls for obtaining drainage timestamps — read them directly from the gRPC summary
  • Gate the remaining DescribeVersion call (for poller checks) on the k8s Deployment existing at 0 replicas; skip entirely if the k8s Deployment has been deleted
  • For a deployment with 1 current + 99 drained versions (no k8s Deployments), this reduces API calls per reconciliation from ~101 to ~2

Test plan

  • Verify existing unit tests pass (go test ./internal/temporal/...)
  • Verify integration tests pass (go test ./internal/tests/...)
  • Verify DrainedSince is correctly populated from the gRPC summary
  • Verify NoTaskQueuesHaveVersionedPoller still works for drained versions with k8s Deployments at 0 replicas
  • Verify drained versions without k8s Deployments no longer trigger DescribeVersion calls

🤖 Generated with Claude Code

Switch GetWorkerDeploymentState from the SDK's Describe() to the gRPC
DescribeWorkerDeployment API, which returns full drainage info (including
DrainedSince timestamps) in the version summaries. This eliminates
per-version DescribeVersion calls just to obtain drainage timestamps.

Additionally, gate the remaining DescribeVersion call (for poller checks)
on the k8s Deployment existing at 0 replicas. Drained versions whose k8s
Deployments have already been deleted no longer trigger any DescribeVersion
calls, since there is no action to take for them.

For a deployment with 1 current + 99 drained versions (no k8s Deployments),
this reduces API calls per reconciliation from ~101 to ~2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@Shivs11 Shivs11 requested review from a team and jlegrone as code owners March 18, 2026 20:48
@Shivs11 Shivs11 changed the title perf: reduce DescribeVersion API calls for drained versions omit DescribeVersion API calls for drained versions Mar 18, 2026
state.RampingSince = &rampingSinceTime
state.RampLastModifiedAt = &lastRampUpdateTime
if routingConfig.RampingDeploymentVersion != nil {
if routingConfig.RampingVersionChangedTime != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know the server would never set a ramping version to be a non-nil without a ramping changed time. However, now, we are directly using the gRPC layer to communicate with the server. The SDK, prior to this PR, was smart enough to convert nil timestamps (if ever present) to time.Time values (nil-safe)

we are now playing with protobuf pointers, and the thought process was to be safe than sorry.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ @Shivs11 it's good to be safe, here.

Comment on lines +221 to +222
// stubWorkflowServiceClient implements workflowservice.WorkflowServiceClient, returning
// a NotFound error for DescribeWorkerDeployment (matching the default stub behavior).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't look like the return from DescribeWorkerDeployment is a NotFound :) Looks like the return is a workflowservice.DescribeWorkerDeploymentResponse to me!

state.RampingSince = &rampingSinceTime
state.RampLastModifiedAt = &lastRampUpdateTime
if routingConfig.RampingDeploymentVersion != nil {
if routingConfig.RampingVersionChangedTime != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++ @Shivs11 it's good to be safe, here.

Comment on lines +150 to +155
version.DeploymentVersion.DeploymentName == routingConfig.CurrentDeploymentVersion.DeploymentName &&
version.DeploymentVersion.BuildId == routingConfig.CurrentDeploymentVersion.BuildId {
versionInfo.Status = temporaliov1alpha1.VersionStatusCurrent
} else if routingConfig.RampingVersion != nil &&
version.Version.DeploymentName == routingConfig.RampingVersion.DeploymentName &&
version.Version.BuildID == routingConfig.RampingVersion.BuildID {
} else if routingConfig.RampingDeploymentVersion != nil &&
version.DeploymentVersion.DeploymentName == routingConfig.RampingDeploymentVersion.DeploymentName &&
version.DeploymentVersion.BuildId == routingConfig.RampingDeploymentVersion.BuildId {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is version.DeploymentVersion guaranteed to be non-nil? If not, let's add a nil-guard somewhere to this code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants