CNTRLPLANE-1137: Add support for external CNI provider Cilium #7423
base: main
Conversation
@wewang58: This pull request references CNTRLPLANE-1137 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Skipping CI for Draft Pull Request.
Walkthrough: Adds Cilium v1.14.5 manifests and e2e support: CRD, namespace, OLM deployment, RBAC, connectivity test manifests, CLI flag for external CNI, e2e test, and helper utilities to install/validate Cilium on HostedClusters.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: wewang58. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
@heliubj18 I added all the templates to assets, can you help review it? Thanks
Actionable comments posted: 2
🧹 Nitpick comments (6)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml (1)
61-64: Consider adding resource limits for completeness. The deployment specifies resource requests but no limits. While acceptable for e2e test infrastructure, adding limits would prevent unbounded resource consumption if the operator misbehaves.
🔎 Proposed enhancement
  resources:
    requests:
      cpu: 100m
      memory: 512Mi
+   limits:
+     cpu: 500m
+     memory: 1Gi

test/e2e/util/util.go (1)
4198-4211: Consider adding a timeout for HTTP requests.
http.Get uses the default HTTP client, which has no timeout. In flaky network conditions, this could cause tests to hang indefinitely.
🔎 Proposed fix to add HTTP timeout
+ httpClient := &http.Client{
+     Timeout: 30 * time.Second,
+ }
+
  if strings.HasPrefix(pathOrURL, "http://") || strings.HasPrefix(pathOrURL, "https://") {
      // Download YAML content
-     resp, err := http.Get(pathOrURL)
+     resp, err := httpClient.Get(pathOrURL)
      if err != nil {
          return fmt.Errorf("failed to download manifest from %s: %w", pathOrURL, err)
      }

test/e2e/util/cilium.go (4)
161-161: Use the CiliumNamespace constant for consistency. The constant CiliumNamespace is defined, but a hardcoded string "cilium" is used here.
🔎 Proposed fix
- err := guestClient.Get(ctx, crclient.ObjectKey{Name: "cilium"}, ciliumNs)
+ err := guestClient.Get(ctx, crclient.ObjectKey{Name: CiliumNamespace}, ciliumNs)
320-333: Consider adding parentheses for clarity. The condition logic is correct due to Go's operator precedence (&& binds before ||), but explicit parentheses would improve readability.
🔎 Proposed fix
- if ds.Name == "cilium" || strings.HasPrefix(ds.Name, "cilium-") && !strings.Contains(ds.Name, "operator") {
+ if ds.Name == "cilium" || (strings.HasPrefix(ds.Name, "cilium-") && !strings.Contains(ds.Name, "operator")) {
470-493: Fixed sleep duration may be fragile. Using time.Sleep with a fixed duration doesn't adapt to varying cluster conditions. Consider polling for specific success indicators if the connectivity test provides them, or document why a fixed duration is necessary.
556-558: Redundant helper function - use ptr.To instead. The ptrBool helper is redundant since k8s.io/utils/ptr is already imported (line 28) and ptr.To[bool](false) or simply ptr.To(false) can be used.
🔎 Proposed fix
Remove the ptrBool function and update the call site:
- AllowPrivilegeEscalation: ptrBool(false),
+ AllowPrivilegeEscalation: ptr.To(false),
And remove lines 556-558:
-func ptrBool(b bool) *bool {
-    return &b
-}
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (20)
- test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
- test/e2e/cilium_network_policy_test.go
- test/e2e/e2e_test.go
- test/e2e/util/cilium.go
- test/e2e/util/hypershift_framework.go
- test/e2e/util/options.go
- test/e2e/util/util.go
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
- test/e2e/util/options.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
- test/e2e/util/hypershift_framework.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
- test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
- test/e2e/e2e_test.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
- test/e2e/cilium_network_policy_test.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
- test/e2e/util/util.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
- test/e2e/util/cilium.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
🧬 Code graph analysis (4)
test/e2e/util/options.go (1)
api/hypershift/v1beta1/hostedcluster_types.go (6)
NonePlatform (1191-1191), AWSPlatform (1188-1188), KubevirtPlatform (1200-1200), AzurePlatform (1203-1203), PowerVSPlatform (1206-1206), OpenStackPlatform (1209-1209)
test/e2e/util/hypershift_framework.go (5)
- test/e2e/util/external_oidc.go (1): ExtOIDCConfig (53-70)
- api/hypershift/v1beta1/hostedcluster_types.go (3): NonePlatform (1191-1191), NetworkType (1166-1166), Other (1179-1179)
- support/util/visibility.go (1): IsPrivateHC (34-44)
- test/e2e/util/util.go (1): WaitForGuestClient (325-373)
- test/e2e/util/cilium.go (1): InstallCilium (151-342)
test/e2e/cilium_network_policy_test.go (5)
- api/hypershift/v1beta1/hostedcluster_types.go (1): AzurePlatform (1203-1203)
- test/e2e/util/hypershift_framework.go (1): NewHypershiftTest (66-78)
- support/azureutil/azureutil.go (1): IsAroHCP (217-219)
- test/e2e/util/util.go (1): WaitForGuestClient (325-373)
- test/e2e/util/cilium.go (1): EnsureCiliumConnectivityTestResources (346-500)
test/e2e/util/cilium.go (3)
- support/azureutil/azureutil.go (1): IsAroHCP (217-219)
- test/e2e/util/util.go (3): ApplyYAML (4194-4264), WaitForDeploymentAvailable (4163-4175), WaitForDaemonSetReady (4178-4190)
- support/util/util.go (1): DeleteIfNeeded (142-144)
🪛 Checkov (3.2.334)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
[medium] 1-75: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 1-75: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 1-75: Containers should not share the host network namespace
(CKV_K8S_19)
test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
[medium] 3-58: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 3-58: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 59-115: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 59-115: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 116-180: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 116-180: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 116-180: Containers should not share the host network namespace
(CKV_K8S_19)
[medium] 181-236: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 181-236: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 237-292: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 237-292: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 293-338: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 293-338: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 339-394: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 339-394: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 395-450: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 395-450: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 451-516: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 451-516: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 517-582: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 517-582: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 583-649: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 583-649: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 583-649: Containers should not share the host network namespace
(CKV_K8S_19)
[medium] 650-716: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 650-716: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 650-716: Containers should not share the host network namespace
(CKV_K8S_19)
[medium] 717-782: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 717-782: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 783-848: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 783-848: Minimize the admission of root containers
(CKV_K8S_23)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
[medium] 1-82: Minimize wildcard use in Roles and ClusterRoles
(CKV_K8S_49)
🪛 golangci-lint (2.5.0)
test/e2e/util/cilium.go
[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn
(typecheck)
🔇 Additional comments (26)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml (1)
1-13: LGTM! The RoleBinding correctly binds the cilium-olm ServiceAccount to the cilium-olm Role within the cilium namespace. The manifest structure is valid and follows Kubernetes RBAC conventions.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml (1)
1-13: LGTM! The RoleBinding correctly enables leader election for the cilium-olm ServiceAccount by binding it to the leader-election Role within the cilium namespace. The manifest structure is valid.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml (1)
1-12: LGTM! Note AI summary discrepancy. The ClusterRoleBinding correctly references the ClusterRole cilium-cilium-olm (defined in cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml) and binds it to the cilium-olm ServiceAccount. The manifest structure is valid.
Note: The AI summary incorrectly states this binds ClusterRole "cilium-cilium", but the code correctly references "cilium-cilium-olm".
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml (1)
1-11: LGTM! The Namespace configuration is appropriate for a system-level CNI namespace:
- Empty node-selector allows flexible pod scheduling
- Cluster logging and monitoring labels enable observability
- Run-level "0" indicates this is an early-lifecycle system namespace
The manifest structure is valid and follows OpenShift conventions.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml (1)
14-28: This ClusterRole is sourced from the upstream isovalent/olm-for-cilium project and follows the expected pattern for Helm operators, which require broad RBAC permissions to orchestrate application deployment and lifecycle management. The broad CRUD permissions on RBAC resources (roles, rolebindings, clusterroles, clusterrolebindings) are necessary for cilium-olm to create and manage Cilium's own RBAC requirements during cluster setup. Since this is a test fixture derived from upstream, the scope and permissions are already vetted by the Cilium project.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml (1)
1-7: LGTM! The ServiceAccount manifest follows standard Kubernetes conventions and is correctly structured for use by the cilium-olm deployment.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml (1)
1-14: LGTM! The Service manifest is correctly structured with appropriate port mapping (443→9443) for the cilium-olm component. The selector properly targets pods with the name: cilium-olm label.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml (1)
1-24: LGTM! The leader election Role follows standard Kubernetes patterns with appropriate permissions for managing ConfigMap-based leader election leases and creating events.
test/e2e/util/options.go (2)
94-95: LGTM! The ExternalCNIProvider field is properly documented and follows the existing pattern for optional feature flags in the Options struct.
206-212: LGTM! The ExternalCNIProvider field is correctly propagated to PlatformAgnosticOptions, maintaining consistency with how other platform-specific options are handled in DefaultClusterOptions.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml (1)
1-82: LGTM - Standard Cilium operator RBAC. The wildcard permissions on cilium.io resources and CRDs are expected for a CNI operator that needs to manage its own custom resources. The remaining rules are appropriately scoped. This aligns with upstream Cilium installation patterns and is acceptable for e2e test infrastructure.
test/e2e/cilium_network_policy_test.go (1)
27-52: LGTM - Well-structured test with proper skip conditions and cleanup. The test correctly:
- Runs in parallel
- Skips appropriately when not on Azure platform or when Cilium isn't the external CNI provider
- Performs ARO HCP environment check inside the test harness
- Defers cleanup of connectivity test resources
test/e2e/util/hypershift_framework.go (1)
51-52: LGTM - New fields extend platform options appropriately. The ExternalCNIProvider field follows the existing pattern established by ExtOIDCConfig.
test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml (1)
1-41: LGTM - Standard CiliumConfig CRD definition. The CRD follows proper structure with appropriate use of x-kubernetes-preserve-unknown-fields for the operator-managed spec and status fields.
test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml (2)
1-57: Note: Unusual YAML document ordering is valid but unconventional. The apiVersion and kind fields appear at the end of each document rather than the beginning. While valid YAML (field order doesn't matter), this is atypical and may confuse readers. Since this is marked as auto-generated from Cilium upstream, modifying it may complicate future updates.
116-180: Host networking is intentional for connectivity test scenarios. The hostNetwork: true in echo-b-host and similar deployments is required for testing host-to-pod and cross-node connectivity scenarios. The static analysis warnings about hostNetwork and security contexts can be safely ignored for these e2e test assets.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml (1)
1-12: LGTM - Properly binds ClusterRole to ServiceAccount. The binding correctly references the cilium-cilium ClusterRole and the cilium-olm ServiceAccount in the cilium namespace, consistent with other manifests in this PR.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml (1)
68-72: Host networking and broad tolerations are appropriate for CNI operators. The hostNetwork: true setting is required for CNI operators to configure the node's network stack. The broad toleration (operator: Exists) ensures the operator can run on control plane nodes if needed. Static analysis warnings can be safely ignored for this use case.
test/e2e/e2e_test.go (1)
93-93: LGTM! The new CLI flag follows the established pattern for e2e test options and integrates cleanly with the existing flag definitions.
test/e2e/util/util.go (3)
4162-4175: LGTM! The helper function follows the established gomega Eventually pattern used elsewhere in this file and provides a clean interface for waiting on deployment readiness.
4177-4190: LGTM! The helper function correctly implements the wait pattern for DaemonSet readiness, consistent with WaitForDeploymentAvailable.
4220-4264: LGTM! The YAML decoding and Server-Side Apply logic is well-implemented with proper handling of multi-document YAMLs, namespace scoping via RESTMapper, and clear error messages.
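For readers who want the shape of that pattern, a minimal sketch of multi-document YAML decoding followed by Server-Side Apply with controller-runtime could look like the following; the function name and field-owner string are illustrative, and the RESTMapper-based namespace scoping that the real ApplyYAML performs is omitted here:

```go
package util

import (
	"bytes"
	"context"
	"fmt"
	"io"

	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	k8syaml "k8s.io/apimachinery/pkg/util/yaml"
	crclient "sigs.k8s.io/controller-runtime/pkg/client"
)

// applyManifests decodes a multi-document YAML stream and applies each object
// with Server-Side Apply. Sketch only; the PR's ApplyYAML additionally resolves
// namespace scoping via a RESTMapper.
func applyManifests(ctx context.Context, c crclient.Client, yamlContent []byte) error {
	decoder := k8syaml.NewYAMLOrJSONDecoder(bytes.NewReader(yamlContent), 4096)
	for {
		obj := &unstructured.Unstructured{}
		if err := decoder.Decode(obj); err != nil {
			if err == io.EOF {
				return nil
			}
			return fmt.Errorf("failed to decode manifest: %w", err)
		}
		// Skip empty documents produced by stray "---" separators.
		if len(obj.Object) == 0 {
			continue
		}
		if err := c.Patch(ctx, obj, crclient.Apply, crclient.ForceOwnership, crclient.FieldOwner("e2e-test")); err != nil {
			return fmt.Errorf("failed to apply %s %s: %w", obj.GetKind(), obj.GetName(), err)
		}
	}
}
```

Server-Side Apply keeps repeated applies idempotent across test retries, which is presumably why it was chosen over a bare Create/Update.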
test/e2e/util/cilium.go (4)
33-63: LGTM! Constants are well-organized with clear naming and documentation. Reading CiliumVersion from the environment allows flexibility in CI configurations.
65-147: LGTM! The manifest helper functions are well-structured with appropriate PodSecurity labels and SCC capabilities for Cilium. The local/remote manifest fallback pattern provides good flexibility for testing.
502-528: LGTM! Good defensive coding with sensible defaults and graceful error handling. The IPv4 detection logic via strings.Contains(clusterNet.CIDR, ".") is simple and effective.
560-629: LGTM! The CreateCiliumConfig function creates a comprehensive Cilium configuration with appropriate settings for OpenShift integration. Debug mode being enabled is reasonable for e2e test observability.
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - '*'
- apiGroups:
  - ""
  resources:
  - serviceaccounts
  - configmaps
  - secrets
  - services
  verbs:
  - '*'
- apiGroups:
  - apps
  resources:
  - deployments
  - daemonsets
  verbs:
  - '*'
- apiGroups:
  - monitoring.coreos.com
  resources:
  - servicemonitors
  verbs:
  - '*'
Review the use of wildcard RBAC permissions.
Multiple rules use wildcard verbs (*), granting full control over secrets, serviceaccounts, configmaps, services, deployments, daemonsets, and servicemonitors within the cilium namespace. While operators typically require elevated permissions, wildcards are overly permissive and violate the principle of least privilege.
Consider explicitly listing only the required verbs (e.g., get, list, watch, create, update, patch, delete) rather than using *. This reduces the attack surface if the operator is compromised.
Note: secrets appears in two separate rules (lines 42 and 50), which is redundant.
🤖 Prompt for AI Agents
In
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
around lines 39–66, several RBAC rules use wildcard verbs ("*") and secrets is
duplicated; replace each "*" with the minimal explicit verbs the operator
actually needs (for example get, list, watch, create, update, patch, delete as
applicable per resource) and remove the duplicate secrets entry by consolidating
permissions into a single rule for secrets; ensure each apiGroup/resources block
only lists the precise verbs required for those resources to follow
least-privilege.
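As a concrete illustration of that prompt, a consolidated least-privilege version of the rules quoted above could look roughly like this; the exact verb set cilium-olm needs would still have to be verified against the operator's behavior:

```yaml
# Illustrative only: secrets consolidated into a single core-API rule and
# wildcard verbs replaced with an explicit list.
rules:
- apiGroups: [""]
  resources: ["serviceaccounts", "configmaps", "secrets", "services"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["apps"]
  resources: ["deployments", "daemonsets"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: ["monitoring.coreos.com"]
  resources: ["servicemonitors"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
```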
guestClient := WaitForGuestClient(t, context.Background(), h.client, hostedCluster)
InstallCilium(t, context.Background(), guestClient, hostedCluster)
// wait hosted cluster ready
WaitForNReadyNodes(t, context.Background(), guestClient, opts.NodePoolReplicas, platform)
WaitForImageRollout(t, context.Background(), h.client, hostedCluster)
ValidateHostedClusterConditions(t, context.Background(), h.client, hostedCluster, true, 10*time.Minute)
Use h.ctx instead of context.Background() for proper cancellation propagation.
The test framework already has h.ctx which is properly configured with cancellation. Using context.Background() here means these operations won't respect test cancellation or timeouts, potentially leaving resources orphaned if the test is aborted.
🔎 Proposed fix
if opts.NodePoolReplicas == 0 {
t.Fatal("NodePool replicas must be positive for Cilium to install.")
}
- guestClient := WaitForGuestClient(t, context.Background(), h.client, hostedCluster)
- InstallCilium(t, context.Background(), guestClient, hostedCluster)
+ guestClient := WaitForGuestClient(t, h.ctx, h.client, hostedCluster)
+ InstallCilium(t, h.ctx, guestClient, hostedCluster)
// wait hosted cluster ready
- WaitForNReadyNodes(t, context.Background(), guestClient, opts.NodePoolReplicas, platform)
- WaitForImageRollout(t, context.Background(), h.client, hostedCluster)
- ValidateHostedClusterConditions(t, context.Background(), h.client, hostedCluster, true, 10*time.Minute)
+ WaitForNReadyNodes(t, h.ctx, guestClient, opts.NodePoolReplicas, platform)
+ WaitForImageRollout(t, h.ctx, h.client, hostedCluster)
+ ValidateHostedClusterConditions(t, h.ctx, h.client, hostedCluster, true, 10*time.Minute)
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
guestClient := WaitForGuestClient(t, h.ctx, h.client, hostedCluster)
InstallCilium(t, h.ctx, guestClient, hostedCluster)
// wait hosted cluster ready
WaitForNReadyNodes(t, h.ctx, guestClient, opts.NodePoolReplicas, platform)
WaitForImageRollout(t, h.ctx, h.client, hostedCluster)
ValidateHostedClusterConditions(t, h.ctx, h.client, hostedCluster, true, 10*time.Minute)
🤖 Prompt for AI Agents
In test/e2e/util/hypershift_framework.go around lines 150 to 155, several calls
use context.Background() which bypasses the framework's cancellation; replace
each context.Background() with h.ctx (i.e., use h.ctx for WaitForGuestClient,
InstallCilium, WaitForNReadyNodes, WaitForImageRollout, and
ValidateHostedClusterConditions) so the operations inherit the test's context
and proper cancellation/timeouts.
Why: To enable e2e testing of HyperShift clusters using Cilium as the CNI provider.
How:
- Added `ExternalCNIProvider` flag and option to the e2e framework.
- Implemented logic to install Cilium and skip default network validation when enabled.
- Added utility functions `ApplyYAMLFromURL` and wait helpers for Deployments/DaemonSets.
- Included Cilium-specific installation logic and network policy tests.
Actionable comments posted: 1
Fix all issues with AI Agents 🤖
In @test/e2e/util/util.go:
- Around line 4198-4211: The HTTP download uses the default client (http.Get)
without a timeout which can hang; replace the http.Get(pathOrURL) call with a
dedicated http.Client having a reasonable Timeout (e.g., 10s) and call
client.Get(pathOrURL) instead, then keep the existing resp.Body.Close, status
code check and io.ReadAll handling; locate the download block around the
http.Get usage (the pathOrURL/http.Get/yamlContent section) and swap in the
timeout-capable client to ensure the request fails fast on unresponsive hosts.
♻️ Duplicate comments (1)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml (1)
39-66: Wildcard RBAC and duplicate secrets entry already flagged. This concern was raised in a previous review. The wildcard verbs and duplicate secrets resource (Lines 42 and 50) should be addressed per the earlier feedback.
🧹 Nitpick comments (7)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml (1)
17-72: Consider adding a security context to restrict privilege escalation. The deployment lacks a securityContext to explicitly disable privilege escalation. While hostNetwork: true and the broad toleration are necessary for CNI operators, you can still harden the container:

    containers:
    - name: operator
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true

Since this is test infrastructure for e2e validation, this is optional but would align with security best practices flagged by static analysis.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml (1)
5-17: Wildcard permissions on CRDs are broader than necessary. The wildcard verbs (*) on cilium.io/* resources are acceptable for the Cilium operator. However, the wildcard on customresourcedefinitions (Lines 12-17) grants the ability to create/delete any CRD in the cluster, not just Cilium's. For test infrastructure this is acceptable, but in production environments consider restricting to specific CRD names or limiting verbs to what's actually needed (e.g., get, list, watch, create, update, patch).
test/e2e/util/util.go (2)
4162-4174: Consider strengthening the readiness check. The current check only verifies that ReadyReplicas > 0, which may not ensure the Deployment is fully ready. If the Deployment has multiple replicas, this could return success prematurely.
🔎 Proposed improvement

      func WaitForDeploymentAvailable(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
          g := NewWithT(t)
          t.Logf("Waiting for deployment %s/%s to be ready", namespace, name)
          g.Eventually(func() bool {
              deployment := &appsv1.Deployment{}
              err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, deployment)
              if err != nil {
                  t.Logf("Failed to get deployment %s/%s: %v", namespace, name, err)
                  return false
              }
    -         return deployment.Status.ReadyReplicas > 0
    +         // Check that all desired replicas are ready
    +         return deployment.Status.ReadyReplicas > 0 &&
    +             deployment.Status.ReadyReplicas == deployment.Status.Replicas &&
    +             deployment.Status.ObservedGeneration >= deployment.Generation
          }, timeout, interval).Should(BeTrue(), fmt.Sprintf("deployment %s/%s should be ready", namespace, name))
      }
4177-4189: Consider strengthening the DaemonSet readiness check. The current check only verifies that NumberReady > 0, which doesn't ensure the DaemonSet has pods ready on all desired nodes.
🔎 Proposed improvement

      func WaitForDaemonSetReady(ctx context.Context, t *testing.T, client crclient.Client, name, namespace string, timeout, interval time.Duration) {
          g := NewWithT(t)
          t.Logf("Waiting for DaemonSet %s/%s to be ready", namespace, name)
          g.Eventually(func() bool {
              ds := &appsv1.DaemonSet{}
              err := client.Get(ctx, crclient.ObjectKey{Name: name, Namespace: namespace}, ds)
              if err != nil {
                  t.Logf("Failed to get DaemonSet %s/%s: %v", namespace, name, err)
                  return false
              }
    -         return ds.Status.NumberReady > 0
    +         // Check that all desired pods are ready and the rollout is complete
    +         return ds.Status.NumberReady > 0 &&
    +             ds.Status.NumberReady == ds.Status.DesiredNumberScheduled &&
    +             ds.Status.ObservedGeneration >= ds.Generation
          }, timeout, interval).Should(BeTrue(), fmt.Sprintf("DaemonSet %s/%s should be ready", namespace, name))
      }

test/e2e/util/cilium.go (3)
259-307: Complex Eventually block with nested updates - consider simplifying. The Eventually block handles the case where the Cilium operator might overwrite the CiliumConfig, but the nested update logic makes it harder to debug failures. Consider extracting this to a helper function or adding more structured logging.
This is not blocking, but could improve maintainability for debugging test failures.
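One hedged option for flattening nested get-mutate-update logic is client-go's conflict-retry wrapper; whether it maps onto the actual CiliumConfig flow depends on what the Eventually block does, and the helper name below is invented for illustration:

```go
package util

import (
	"context"

	"k8s.io/client-go/util/retry"
	crclient "sigs.k8s.io/controller-runtime/pkg/client"
)

// updateWithRetry re-reads the object and re-applies mutate whenever the
// server reports a conflict, which is one way to avoid hand-rolled nested
// update loops inside an Eventually block.
func updateWithRetry(ctx context.Context, c crclient.Client, obj crclient.Object, mutate func() error) error {
	return retry.RetryOnConflict(retry.DefaultRetry, func() error {
		if err := c.Get(ctx, crclient.ObjectKeyFromObject(obj), obj); err != nil {
			return err
		}
		if err := mutate(); err != nil {
			return err
		}
		return c.Update(ctx, obj)
	})
}
```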
470-493: Replace fixed sleep with polling for test completion. The current implementation waits a fixed 60 seconds (line 474) and then only checks if pods are still running. This approach has limitations:
- Fixed waits are brittle - tests might pass/fail based on timing
- Checking only pod phase misses container-level failures (CrashLoopBackOff, errors)
- Could result in false positives if connectivity tests are still running
Consider polling for test completion status instead (e.g., checking container termination status or using a Job-based approach) and examining container logs for test results.
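A hedged sketch of what polling could look like in place of the fixed sleep, assuming the connectivity-check pods themselves are the success indicator; the function name, interval, and CrashLoopBackOff heuristic are illustrative rather than taken from the PR:

```go
package util

import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/wait"
	crclient "sigs.k8s.io/controller-runtime/pkg/client"
)

// waitForConnectivityPodsReady succeeds once every pod in the namespace is
// Running and Ready with no container stuck in CrashLoopBackOff, instead of
// sleeping for a fixed duration.
func waitForConnectivityPodsReady(ctx context.Context, c crclient.Client, namespace string, timeout time.Duration) error {
	return wait.PollUntilContextTimeout(ctx, 10*time.Second, timeout, true, func(ctx context.Context) (bool, error) {
		podList := &corev1.PodList{}
		if err := c.List(ctx, podList, crclient.InNamespace(namespace)); err != nil {
			return false, nil // treat list errors as transient and keep polling
		}
		if len(podList.Items) == 0 {
			return false, nil
		}
		for _, pod := range podList.Items {
			if pod.Status.Phase != corev1.PodRunning {
				return false, nil
			}
			for _, cs := range pod.Status.ContainerStatuses {
				if cs.State.Waiting != nil && cs.State.Waiting.Reason == "CrashLoopBackOff" {
					return false, nil
				}
				if !cs.Ready {
					return false, nil
				}
			}
		}
		return true, nil
	})
}
```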
556-558: Use existing pointer helper instead of custom function. The repository already uses k8s.io/utils/ptr (imported as ptr in other files). Consider using ptr.To[bool](b) instead of maintaining a custom ptrBool function.
🔎 Proposed change
-func ptrBool(b bool) *bool {
-    return &b
-}
And update the usage at line 386:
- AllowPrivilegeEscalation: ptrBool(false),
+ AllowPrivilegeEscalation: ptr.To(false),
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
📒 Files selected for processing (20)
- test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
- test/e2e/cilium_network_policy_test.go
- test/e2e/e2e_test.go
- test/e2e/util/cilium.go
- test/e2e/util/hypershift_framework.go
- test/e2e/util/options.go
- test/e2e/util/util.go
🚧 Files skipped from review as they are similar to previous changes (10)
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00006-leader-election-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00010-cilium-cilium-olm-clusterrolebinding.yaml
- test/e2e/e2e_test.go
- test/e2e/util/options.go
- test/e2e/util/hypershift_framework.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00001-cilium-olm-serviceaccount.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00004-cilium-olm-leader-election-role.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00003-cilium-olm-service.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00008-cilium-cilium-olm-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00000-cilium-namespace.yaml
🧰 Additional context used
📓 Path-based instructions (1)
**
⚙️ CodeRabbit configuration file
-Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- test/e2e/cilium_network_policy_test.go
- test/e2e/util/util.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml
- test/e2e/util/cilium.go
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00005-cilium-olm-role.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
- test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
- test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
🧬 Code graph analysis (1)
test/e2e/util/cilium.go (2)
- support/azureutil/azureutil.go (1): IsAroHCP (217-219)
- test/e2e/util/util.go (3): ApplyYAML (4194-4264), WaitForDeploymentAvailable (4163-4175), WaitForDaemonSetReady (4178-4190)
🪛 Checkov (3.2.334)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00002-cilium-olm-deployment.yaml
[medium] 1-75: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 1-75: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 1-75: Containers should not share the host network namespace
(CKV_K8S_19)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00009-cilium-cilium-clusterrole.yaml
[medium] 1-82: Minimize wildcard use in Roles and ClusterRoles
(CKV_K8S_49)
test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml
[medium] 3-58: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 3-58: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 59-115: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 59-115: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 116-180: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 116-180: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 116-180: Containers should not share the host network namespace
(CKV_K8S_19)
[medium] 181-236: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 181-236: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 237-292: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 237-292: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 293-338: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 293-338: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 339-394: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 339-394: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 395-450: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 395-450: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 451-516: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 451-516: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 517-582: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 517-582: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 583-649: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 583-649: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 583-649: Containers should not share the host network namespace
(CKV_K8S_19)
[medium] 650-716: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 650-716: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 650-716: Containers should not share the host network namespace
(CKV_K8S_19)
[medium] 717-782: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 717-782: Minimize the admission of root containers
(CKV_K8S_23)
[medium] 783-848: Containers should not run with allowPrivilegeEscalation
(CKV_K8S_20)
[medium] 783-848: Minimize the admission of root containers
(CKV_K8S_23)
🪛 golangci-lint (2.5.0)
test/e2e/util/cilium.go
[error] 228-228: : # github.com/openshift/hypershift/sync-global-pullsecret [github.com/openshift/hypershift/sync-global-pullsecret.test]
sync-global-pullsecret/sync-global-pullsecret_test.go:228:23: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:234:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:247:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:257:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:270:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:283:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:296:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:309:26: undefined: MockdbusConn
sync-global-pullsecret/sync-global-pullsecret_test.go:327:12: undefined: NewMockdbusConn
(typecheck)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Red Hat Konflux / hypershift-release-mce-211-on-pull-request
- GitHub Check: Red Hat Konflux / hypershift-operator-main-on-pull-request
- GitHub Check: Red Hat Konflux / hypershift-cli-mce-211-on-pull-request
🔇 Additional comments (7)
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00007-cilium-olm-rolebinding.yaml (1)
1-13: LGTM! Standard RoleBinding correctly binding the cilium-olm Role to the cilium-olm ServiceAccount in the cilium namespace.
test/e2e/assets/cilium/v1.14.5/cluster-network-03-cilium-ciliumconfigs-crd.yaml (1)
1-41: LGTM! Well-structured CRD definition for CiliumConfig with appropriate x-kubernetes-preserve-unknown-fields for flexible operator configuration and the status subresource enabled.
test/e2e/cilium_network_policy_test.go (2)
43-51: LGTM! Good test structure with proper resource cleanup via defer. The nested ARO HCP check correctly differentiates between generic Azure and ARO HCP-specific requirements.
17-25: No changes needed. The comment accurately describes the test implementation. EnsureCiliumConnectivityTestResources internally checks Cilium pod readiness as its first step via the CheckCiliumPodsRunning subtest before proceeding with subsequent setup operations. The documented steps align with the actual execution flow.
test/e2e/assets/cilium/v1.14.5/connectivity-check.yaml (1)
1-2: Auto-generated test manifest - static analysis findings are expected. This is a standard Cilium connectivity test suite auto-generated by Makefile. The hostNetwork: true configurations and lack of explicit security contexts are intentional for testing various networking scenarios (host-to-pod, cross-node, nodeport access). The static analysis warnings about privilege escalation and host networking are false positives in this context.
test/e2e/assets/cilium/v1.14.5/cluster-network-06-cilium-00011-cilium-cilium-clusterrolebinding.yaml (1)
1-12: LGTM! Standard ClusterRoleBinding manifest that correctly binds the cilium-cilium ClusterRole to the cilium-olm ServiceAccount in the cilium namespace.
test/e2e/util/cilium.go (1)
81-114: LGTM - Appropriately permissive SCC for CNI workload. The SecurityContextConstraints are correctly configured for a CNI provider like Cilium, which requires elevated privileges (host network, privileged containers, etc.). The permissions are scoped to specific service accounts in the cilium namespace.
if strings.HasPrefix(pathOrURL, "http://") || strings.HasPrefix(pathOrURL, "https://") {
    // Download YAML content
    resp, err := http.Get(pathOrURL)
    if err != nil {
        return fmt.Errorf("failed to download manifest from %s: %w", pathOrURL, err)
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("failed to download manifest from %s: HTTP %d", pathOrURL, resp.StatusCode)
    }
    yamlContent, err = io.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read manifest content: %w", err)
    }
Add timeout to HTTP client for reliability.
The http.Get call at line 4200 uses the default HTTP client with no timeout, which could hang indefinitely if the server is unresponsive. This is particularly problematic in CI/CD environments or when network conditions are poor.
🔎 Proposed fix
if strings.HasPrefix(pathOrURL, "http://") || strings.HasPrefix(pathOrURL, "https://") {
// Download YAML content
- resp, err := http.Get(pathOrURL)
+ client := &http.Client{
+ Timeout: 30 * time.Second,
+ }
+ resp, err := client.Get(pathOrURL)
if err != nil {
return fmt.Errorf("failed to download manifest from %s: %w", pathOrURL, err)
}
defer resp.Body.Close()
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("failed to download manifest from %s: HTTP %d", pathOrURL, resp.StatusCode)
}
yamlContent, err = io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("failed to read manifest content: %w", err)
}
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
if strings.HasPrefix(pathOrURL, "http://") || strings.HasPrefix(pathOrURL, "https://") {
    // Download YAML content
    client := &http.Client{
        Timeout: 30 * time.Second,
    }
    resp, err := client.Get(pathOrURL)
    if err != nil {
        return fmt.Errorf("failed to download manifest from %s: %w", pathOrURL, err)
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("failed to download manifest from %s: HTTP %d", pathOrURL, resp.StatusCode)
    }
    yamlContent, err = io.ReadAll(resp.Body)
    if err != nil {
        return fmt.Errorf("failed to read manifest content: %w", err)
    }
🤖 Prompt for AI Agents
In @test/e2e/util/util.go around lines 4198-4211, The HTTP download uses the
default client (http.Get) without a timeout which can hang; replace the
http.Get(pathOrURL) call with a dedicated http.Client having a reasonable
Timeout (e.g., 10s) and call client.Get(pathOrURL) instead, then keep the
existing resp.Body.Close, status code check and io.ReadAll handling; locate the
download block around the http.Get usage (the pathOrURL/http.Get/yamlContent
section) and swap in the timeout-capable client to ensure the request fails fast
on unresponsive hosts.
@wewang58: The following test failed, say
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
@jparrill @mgencur @heliubj18 I submitted a new PR to make it easier to review; it's almost the same as the old PR, I just moved some templates to the assets directory. Can you review again? Thanks.
The AKS test fails with: (waits for OVN but this is Cilium)
mgencur
left a comment
Hi @wewang58, a few comments here and there. Thanks
    t.Skip("Skipping test because it requires Azure platform")
}

if globalOpts.ExternalCNIProvider != "cilium" {
Can you extract "cilium" as const and re-use in the other places? It's used a few times in this PR.
@mgencur Thanks for reviewing, I will update the other places and test it.
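A possible shape for the suggested constant (its name and package are hypothetical; it could equally live in the e2e package itself):

```go
package util

// ExternalCNIProviderCilium is the value of the --e2e.external-cni-provider
// flag that selects Cilium.
const ExternalCNIProviderCilium = "cilium"
```

Call sites such as the skip check quoted above would then compare globalOpts.ExternalCNIProvider against the constant instead of repeating the string literal.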
flag.BoolVar(&globalOpts.HOInstallationOptions.EnableCIDebugOutput, "e2e.ho-enable-ci-debug-output", false, "Install the HyperShift Operator with extra CI debug output enabled. This is a HyperShift Operator installation option")
flag.StringVar(&globalOpts.HOInstallationOptions.PlatformMonitoring, "e2e.platform-monitoring", "All", "The option for enabling platform cluster monitoring when installing the HyperShift Operator. Valid values are: None, OperatorOnly, All. This is a HyperShift Operator installation option")
flag.BoolVar(&globalOpts.RunUpgradeTest, "upgrade.run-tests", false, "Run HyperShift Operator upgrade test")
flag.StringVar(&globalOpts.ExternalCNIProvider, "e2e.external-cni-provider", "", "the option supported third cni provider: cilium, calico")
Does it make sense to mention "calico" here? Maybe something like this would be better: "The option supports the following CNI providers: cilium"
...and later we could add more if necessary.
CiliumDefaultTimeout = 10 * time.Minute
CiliumLongTimeout = 20 * time.Minute
CiliumShortTimeout = 2 * time.Minute
CiliumDefaultPollInterval = 10 * time.Second
On the other PR, @jparrill was requesting those variables to be package private (AFAIK). #7077 (comment)
I'm just bringing this up here to speed up the loop.
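For reference, a minimal sketch of the package-private variant being requested; the lowercased names are assumptions about the eventual naming, not the PR's current code:

```go
package util

import "time"

// Package-private Cilium e2e timeouts: only code inside this package can
// reference them.
const (
	ciliumDefaultTimeout      = 10 * time.Minute
	ciliumLongTimeout         = 20 * time.Minute
	ciliumShortTimeout        = 2 * time.Minute
	ciliumDefaultPollInterval = 10 * time.Second
)
```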
if _, err := os.Stat(localPath); err == nil {
    manifests = append(manifests, localPath)
} else {
    manifests = append(manifests, fmt.Sprintf("%s/%s", remoteBase, f))
I suppose you included the Cilium YAML files in this PR to prevent downloading the files every time (e.g. for disconnected envs). Since we now have the files locally, why keep this fallback for downloading them?
It is unlikely that some files would be accidentally removed from the repo.
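If the download fallback were dropped as suggested, manifest resolution could collapse to a local-only lookup along these lines; the function name and signature are illustrative, not the PR's actual code:

```go
package util

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveLocalManifests requires every Cilium manifest to exist under
// assetsDir and fails fast otherwise, instead of falling back to a remote
// download.
func resolveLocalManifests(assetsDir string, files []string) ([]string, error) {
	manifests := make([]string, 0, len(files))
	for _, f := range files {
		localPath := filepath.Join(assetsDir, f)
		if _, err := os.Stat(localPath); err != nil {
			return nil, fmt.Errorf("cilium manifest %s not found locally: %w", localPath, err)
		}
		manifests = append(manifests, localPath)
	}
	return manifests, nil
}
```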
func (h *hypershiftTest) before(hostedCluster *hyperv1.HostedCluster, opts *PlatformAgnosticOptions, platform hyperv1.PlatformType) {
    h.Run("ValidateHostedCluster", func(t *testing.T) {
-       if platform != hyperv1.NonePlatform {
+       if platform != hyperv1.NonePlatform && hostedCluster.Spec.Networking.NetworkType != hyperv1.Other {
This is a related comment on the PR that will trigger the test suite: https://github.com/openshift/release/pull/71383/files#r2672253127 . PTAL. I think it's important to set --network-type=Other there.
You mean if EXTERNAL_CNI_PROVIDER is not null, we should add --e2e.external-cni-provider and --network-type=Other in https://github.com/openshift/release/pull/71383/files#diff-36e084ed3de5a17840678199ccb286a3ff6b2e47b9b6ebbbf011a26e43932f5cL106, right? I will update that PR.
Yeah. This is what I mean.
I have a CI PR, openshift/release#71383, that depends on this PR.
What this PR does / why we need it:
This PR implements Cilium installation and tests basic network policy scenarios in an AKS environment. The previous PR was closed to make this one easier to review.
Which issue(s) this PR fixes:
Fixes https://issues.redhat.com/browse/CNTRLPLANE-1137
Special notes for your reviewer:
- `ExternalCNIProvider` flag and option added to the e2e framework.
- `ApplyYAMLFromURL` and wait helpers for Deployments/DaemonSets.
Checklist: