@wallrj wallrj commented Nov 13, 2024

The aim of this PR is to avoid triggering the warnings logged in client-go/tools/cache/reflector.go when a resource group version is not installed in the Kubernetes API server.
An example of such a warning, with --logging-format=json :

{"caller":"cache/reflector.go:561", "msg":"k8s.io/[email protected]/tools/cache/reflector.go:243: failed to list cas-issuer.jetstack.io/v1beta1, Resource=googlecasissuers: the server could not find the requested resource", "ts":1.7320377905120225E12, "v":0}

ℹ️ With --logging-format=json, the warning is actually logged as an info message on stdout.
With the default, --logging-format=text, the warning is logged with a W prefix on stderr.

It is easy enough to list the available APIs and start DataGatherers only for the available resource types,
but it is not easy to preserve the existing behaviour: the agent starts collecting data for resources (CRDs) that are installed after it has been deployed,
and stops reporting data for resource types that are removed from the API server after it has started.
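
To make the "easy" half concrete, here is a minimal sketch of the filtering step, assuming the set of installed resource types has already been fetched from the API server. The `GVR` type is a hypothetical stand-in for `schema.GroupVersionResource`, and `splitByAvailability` is an illustrative name, not a function in the agent.

```go
package main

import "fmt"

// GVR is a hypothetical stand-in for schema.GroupVersionResource from
// k8s.io/apimachinery, so that this sketch needs only the standard library.
type GVR struct {
	Group, Version, Resource string
}

// String mimics the "group/version, Resource=name" form seen in the agent logs.
func (g GVR) String() string {
	return g.Group + "/" + g.Version + ", Resource=" + g.Resource
}

// splitByAvailability partitions the configured GVRs into those that the API
// server reports as installed and those that it does not.
func splitByAvailability(configured []GVR, available map[string]struct{}) (start, skip []GVR) {
	for _, gvr := range configured {
		if _, ok := available[gvr.String()]; ok {
			start = append(start, gvr)
		} else {
			skip = append(skip, gvr)
		}
	}
	return start, skip
}

func main() {
	// Assumed discovery result: only core Pods are installed.
	available := map[string]struct{}{
		GVR{"", "v1", "pods"}.String(): {},
	}
	configured := []GVR{
		{"", "v1", "pods"},
		{"cert-manager.io", "v1", "certificates"},
	}

	start, skip := splitByAvailability(configured, available)
	for _, gvr := range start {
		fmt.Println("Starting DataGatherer:", gvr)
	}
	for _, gvr := range skip {
		fmt.Println("Skipping DataGatherer:", gvr, "reason: GroupVersionResource not installed")
	}
}
```

This only captures the state at startup, which is exactly why the "hard" half (resources installed or removed later) is not covered by it.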

This problem apparently also affects the kube-controller-manager, and is being discussed in kubernetes/kubernetes#79610

The venafi-kubernetes-agent already suppresses the errors by installing a custom WatchErrorHandler, which turns the errors into V(1) info messages:

// Run starts the dynamic data gatherer's informers for resource collection.
// Returns error if the data gatherer informer wasn't initialized, Run blocks
// until the stopCh is closed.
func (g *DataGathererDynamic) Run(stopCh <-chan struct{}) error {
	log := klog.FromContext(g.ctx)
	if g.informer == nil {
		return fmt.Errorf("informer was not initialized, impossible to start")
	}
	// attach WatchErrorHandler, it needs to be set before starting an informer
	err := g.informer.SetWatchErrorHandler(func(r *k8scache.Reflector, err error) {
		if strings.Contains(fmt.Sprintf("%s", err), "the server could not find the requested resource") {
			log.V(logs.Debug).Info("Server missing resource for datagatherer", "groupVersionResource", g.groupVersionResource)
		} else {
			log.Info("datagatherer informer has failed and is backing off", "groupVersionResource", g.groupVersionResource, "reason", err)
		}
	})
	if err != nil {
		return fmt.Errorf("failed to SetWatchErrorHandler on informer: %s", err)
	}
	// start shared informer
	g.informer.Run(stopCh)
	return nil
}
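
The decision inside that handler can be isolated as a small pure function, which makes it easy to test without an informer. This is only a sketch using the standard library; `isMissingResource` and the sample errors are hypothetical names, and the match is on message text, as in the handler above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// missingResourceMsg is the message fragment that the handler searches for.
const missingResourceMsg = "the server could not find the requested resource"

// Sample errors for illustration only; they are not real client-go errors.
var (
	errMissing = fmt.Errorf("failed to list googlecasissuers: %s", missingResourceMsg)
	errOther   = errors.New("connection refused")
)

// isMissingResource reports whether err looks like the API server's
// "resource not found" response. It matches on message text, so it is
// sensitive to any change in that wording.
func isMissingResource(err error) bool {
	return err != nil && strings.Contains(err.Error(), missingResourceMsg)
}

func main() {
	for _, err := range []error{errMissing, errOther} {
		fmt.Printf("%v -> missing resource: %t\n", err, isMissingResource(err))
	}
}
```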

I learned from the klog authors that there is an in-progress PR which will remove this distracting warning message:

So the problem will likely be fixed by upgrading to the next or subsequent release of client-go.
Meanwhile we can advise users how to filter out the warnings in their log servers.
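
As an illustration of such filtering, here is a rough sketch of a stdin-to-stdout filter for logs produced with --logging-format=json. It assumes the entries carry `caller` and `msg` fields like the example warning above; `isReflectorNoise` is a hypothetical name, and a real log server would use its own filter syntax instead.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"strings"
)

// logEntry holds the two fields of a JSON log line that the filter inspects.
type logEntry struct {
	Caller string `json:"caller"`
	Msg    string `json:"msg"`
}

// isReflectorNoise reports whether line is a JSON log entry from
// cache/reflector.go about a resource that the API server does not serve.
func isReflectorNoise(line string) bool {
	var e logEntry
	if err := json.Unmarshal([]byte(line), &e); err != nil {
		return false // not a JSON object: keep the line
	}
	return strings.HasPrefix(e.Caller, "cache/reflector.go:") &&
		strings.Contains(e.Msg, "the server could not find the requested resource")
}

func main() {
	// Copy stdin to stdout, dropping only the matching entries.
	scanner := bufio.NewScanner(os.Stdin)
	scanner.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for scanner.Scan() {
		if line := scanner.Text(); !isReflectorNoise(line) {
			fmt.Println(line)
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, "reading input:", err)
		os.Exit(1)
	}
}
```

Matching on both the caller and the message keeps other reflector.go messages, such as genuine list or watch failures, visible.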

So I will close this PR.

Testing

[2024-11-13 09:31:02] Starting [caller=agent/run.go:61 commit=e97e241b763514de15480ce3e0a570a459eb1006 logger=Run version=v1.2.0-13-ge97e241b763514-dirty]
[2024-11-13 09:31:02] Using the Venafi Cloud VenafiConnection auth mode since --venafi-connection was specified. [caller=agent/config.go:395 logger=Run]
[2024-11-13 09:31:02] ignoring the server field specified in the config file. In Venafi Cloud VenafiConnection mode, this field is not needed. [caller=agent/config.go:431 logger=Run]
[2024-11-13 09:31:02] ignoring the venafi-cloud.upload_path field in the config file. In Venafi Cloud VenafiConnection mode, this field is not needed. [caller=agent/config.go:461 logger=Run]
[2024-11-13 09:31:02] ignoring the venafi-cloud.uploader_id field in the config file. This field is not needed in Venafi Cloud VenafiConnection mode. [caller=agent/config.go:479 logger=Run]
[2024-11-13 09:31:02] Using period from config [caller=agent/config.go:531 logger=Run period=1m0s]
[2024-11-13 09:31:02] Metrics endpoints enabled [addr=:8081 caller=agent/run.go:110 logger=Run.APIServer path=/metrics]
[2024-11-13 09:31:02] Healthz endpoints enabled [addr=:8081 caller=agent/run.go:119 logger=Run.APIServer path=/healthz]
[2024-11-13 09:31:02] Readyz endpoints enabled [addr=:8081 caller=agent/run.go:123 logger=Run.APIServer path=/readyz]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=cert-manager.io/v1, Resource=certificates logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=cas-issuer.jetstack.io/v1beta1, Resource=googlecasissuers logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=cas-issuer.jetstack.io/v1beta1, Resource=googlecasclusterissuers logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=awspca.cert-manager.io/v1beta1, Resource=awspcaissuers logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=awspca.cert-manager.io/v1beta1, Resource=awspcaclusterissuers logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=networking.istio.io/v1alpha3, Resource=gateways logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=networking.istio.io/v1alpha3, Resource=virtualservices logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=route.openshift.io/v1, Resource=routes logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:02] Skipping DataGatherer [caller=agent/run.go:195 groupVersionResource=firefly.venafi.com/v1, Resource=issuers logger=Run reason=GroupVersionResource not installed]
[2024-11-13 09:31:03] Data sent successfully [caller=agent/run.go:437 logger=Run.gatherAndOutputData.postData]

TODO:

  • Handle DataGatherers which are configured to fetch from remote clusters.
  • Start the DataGatherer later if the API resource is eventually installed.
  • Remove the DataGatherer if the API resource is later deleted.
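
One possible shape for the last two TODO items is a periodic reconcile step that compares what is installed with what is running. This is a sketch of an approach that this PR does not implement; `reconcile` and its parameters are hypothetical, and resource types are represented as plain strings to keep it standard-library only.

```go
package main

import (
	"fmt"
	"sort"
)

// reconcile compares the configured resource types against those currently
// installed and those with a running gatherer. It returns the gatherers to
// start (installed but not running) and to stop (running but no longer
// installed), each sorted for stable output.
func reconcile(configured []string, installed, running map[string]bool) (start, stop []string) {
	for _, name := range configured {
		if installed[name] && !running[name] {
			start = append(start, name)
		}
	}
	for name, isRunning := range running {
		if isRunning && !installed[name] {
			stop = append(stop, name)
		}
	}
	sort.Strings(start)
	sort.Strings(stop)
	return start, stop
}

func main() {
	configured := []string{
		"cert-manager.io/v1, Resource=certificates",
		"route.openshift.io/v1, Resource=routes",
	}
	// Assumed state: certificates were installed after startup, and routes
	// were removed while their gatherer was still running.
	installed := map[string]bool{"cert-manager.io/v1, Resource=certificates": true}
	running := map[string]bool{"route.openshift.io/v1, Resource=routes": true}

	start, stop := reconcile(configured, installed, running)
	fmt.Println("start:", start)
	fmt.Println("stop:", stop)
}
```

Calling this on a timer, with a fresh discovery result each time, would trade the reflector warnings for a short delay before newly installed CRDs are picked up.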

@wallrj wallrj force-pushed the VC-35738/info-debug-trace branch from 7d19d0a to 1402d52 Compare November 14, 2024 11:41
Base automatically changed from VC-35738/info-debug-trace to VC-35738/feature November 14, 2024 11:46
Base automatically changed from VC-35738/feature to master November 14, 2024 12:26
@wallrj wallrj force-pushed the VC-33564-only-gather-installed-resource-kinds branch from e97e241 to 34c73e3 Compare November 15, 2024 11:28
@wallrj wallrj force-pushed the VC-33564-only-gather-installed-resource-kinds branch from 34c73e3 to 01e9363 Compare November 15, 2024 11:39
for _, dgConfig := range config.DataGatherers {
	if c, ok := dgConfig.Config.(*k8s.ConfigDynamic); ok {
		gvr := c.GroupVersionResource
		if !availableGVRs.Has(gvr.String()) {

I think the challenge is that we want to start the DataGatherer when the CRD is installed at a later point in time.

wallrj commented Nov 20, 2024

I learned from the klog authors that there is an in-progress PR which will remove this distracting warning message:

So the problem will likely be fixed by upgrading to the next or subsequent release of client-go.
Meanwhile we can advise users how to filter out the warnings in their log servers.

So I will close this PR.

@wallrj wallrj closed this Nov 20, 2024
@wallrj wallrj changed the title WIP: [VC-35738] Only start DataGatherer if its API resource is available [VC-35738] Only start DataGatherer if its API resource is available Nov 20, 2024
@wallrj wallrj deleted the VC-33564-only-gather-installed-resource-kinds branch November 20, 2024 11:02