NO-JIRA: Allow multiple attempts in egress firewall test #29615

Open: ldoktor wants to merge 1 commit into base: main

@ldoktor ldoktor commented Mar 21, 2025

This test fails when run with kata-containers, possibly because kata-containers take longer to start:

curl: (28) Connection timed out after 1001 milliseconds

Let's use the "--retry" feature of curl. This should not affect successful tests, which return immediately, but it might prolong failing tests from 3s to 30s. With kata we need about 6-12s, so 30s should be safe for us.
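
The timing trade-off described above can be sketched in a few lines. This is an illustrative Python sketch, not the change made in this PR: the helper names, the URL, and the numeric values (1s per-attempt limit, 9 retries, 2s fixed delay) are assumptions chosen to land near the 30s budget mentioned above. `--max-time`, `--retry`, and `--retry-delay` are existing curl options, but the options and values the test actually passes are not shown in this conversation.

```python
from typing import List


def build_curl_args(url: str, max_time_s: int, retries: int, retry_delay_s: int) -> List[str]:
    """Build a curl command line that retries transient failures such as timeouts."""
    return [
        "curl",
        "--silent", "--show-error",
        "--max-time", str(max_time_s),        # time limit for each attempt
        "--retry", str(retries),              # extra attempts after the first failure
        "--retry-delay", str(retry_delay_s),  # fixed pause between attempts
        url,
    ]


def worst_case_seconds(max_time_s: int, retries: int, retry_delay_s: int) -> int:
    """Upper bound on total time when every attempt times out."""
    attempts = retries + 1
    return attempts * max_time_s + retries * retry_delay_s


if __name__ == "__main__":
    # Hypothetical values, not taken from the PR diff.
    args = build_curl_args("http://example.com", max_time_s=1, retries=9, retry_delay_s=2)
    print(" ".join(args))
    print(worst_case_seconds(1, 9, 2), "seconds in the worst case")
```

With these assumed values a request that succeeds on the first attempt still returns immediately, while a request that never succeeds is bounded by 10 attempts of 1s plus 9 pauses of 2s. If `--retry-delay` is omitted, curl uses its own backoff between retries, so the bound would differ.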

@openshift-ci openshift-ci bot requested review from knobunc and trozet March 21, 2025 08:00
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 25, 2025
@ldoktor (Author) commented Mar 26, 2025

@knobunc @trozet Hello folks, this is my first contribution to openshift/origin. Should I keep rebasing this PR, or should I wait for a review first?

@ldoktor (Author) commented Apr 8, 2025

CC: @neisw @bertinatto could you please take a look at this? Should I rebase or simply wait for a review?

@neisw (Contributor) commented Apr 8, 2025

Hi @ldoktor, go ahead and rebase. Typically you would get a review from the team responsible for the test (looks like sdn team). Also if you know of a job that typically shows this failure it would be good to run that job for validation along with the regular presubmits.

this test is failing when using kata-containers, which might be related
to longer startup times of kata-containers:

    curl: (28) Connection timed out after 1001 milliseconds

let's use the "--retry" feature of curl. This should not affect the
successful tests as they should return immediately, while it might
prolong the failing tests from 3s to 30s. With kata we need about 6-12s
so 30s should be safe for us.

Signed-off-by: Lukáš Doktor <[email protected]>
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 9, 2025
@ldoktor (Author) commented Apr 9, 2025

> Hi @ldoktor, go ahead and rebase. Typically you would get a review from the team responsible for the test (looks like sdn team). Also if you know of a job that typically shows this failure it would be good to run that job for validation along with the regular presubmits.

Thank you, I wasn't sure. It's rebased. The failed job log is here: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-kata-containers-kata-containers-main-e2e-tests/1903703902650372096. I also manually tested my code with extra debug output: it usually takes 3-4 retries before it's ready, and the test passes with this exact commit as well.
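
A retry count like the one reported above could be observed with a loop such as the following. This is a minimal Python sketch of counting retries until a probe succeeds, not the debug code used for the manual test: `retry_until_ready`, `make_fake_probe`, and the chosen limits are all hypothetical.

```python
import time
from typing import Callable


def retry_until_ready(
    probe: Callable[[], bool],
    max_retries: int,
    delay_s: float,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call probe() until it returns True and return how many retries were needed.

    A return value of 0 means the very first attempt succeeded.
    """
    for retries_used in range(max_retries + 1):
        if probe():
            return retries_used
        if retries_used < max_retries:
            sleep(delay_s)  # wait before the next attempt
    raise TimeoutError(f"not ready after {max_retries} retries")


def make_fake_probe(failures_before_ready: int) -> Callable[[], bool]:
    """Return a probe that fails a fixed number of times, then succeeds (illustration only)."""
    state = {"calls": 0}

    def probe() -> bool:
        state["calls"] += 1
        return state["calls"] > failures_before_ready

    return probe


if __name__ == "__main__":
    # Simulate a target that becomes reachable on the fourth attempt.
    used = retry_until_ready(make_fake_probe(3), max_retries=9, delay_s=0.0)
    print(f"ready after {used} retries")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; a real probe would wrap the actual connectivity check.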

@neisw (Contributor) commented Apr 9, 2025

/payload-job periodic-ci-kata-containers-kata-containers-main-e2e-tests

openshift-ci bot (Contributor) commented Apr 9, 2025

@neisw: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

@neisw (Contributor) commented Apr 9, 2025

No luck running /payload-job periodic-ci-kata-containers-kata-containers-main-e2e-tests. This is a small enough change and the presubmits look fine, so I can go ahead and tag it. Do you have a jira for this work?

@ldoktor (Author) commented Apr 10, 2025

> No luck running /payload-job periodic-ci-kata-containers-kata-containers-main-e2e-tests. This is a small enough change and the presubmits look fine, so I can go ahead and tag it. Do you have a jira for this work?

This is related to upstream testing so we don't have any jira for it.

@neisw (Contributor) commented Apr 11, 2025

/lgtm

you probably want to retitle with NO-JIRA: then

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Apr 11, 2025
openshift-ci bot (Contributor) commented Apr 11, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ldoktor, neisw

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 11, 2025
@ldoktor ldoktor changed the title Allow multiple attempts in egress firewall test NO-JIRA: Allow multiple attempts in egress firewall test Apr 14, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 14, 2025
@openshift-ci-robot

@ldoktor: This pull request explicitly references no jira issue.

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD ed54e26 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 787ed13 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 0c7519e and 1 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 0c7519e and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 0c7519e and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 0c7519e and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD f5a8115 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD f5a8115 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD f5a8115 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD f5a8115 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 11058f6 and 1 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 11058f6 and 2 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD dc67a3a and 1 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD bff0360 and 0 for PR HEAD c0e7f89 in total

@openshift-ci-robot

/hold

Revision c0e7f89 was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 21, 2025
@ldoktor (Author) commented Apr 22, 2025

Hello folks, the only required test https://prow.ci.openshift.org/view/gs/test-platform-results/pr-logs/pull/29615/pull-ci-openshift-origin-main-e2e-aws-ovn-serial/1914323994480218112 failed with

[sig-network][Feature:EgressIP][apigroup:operator.openshift.io] [external-targets][apigroup:user.openshift.io][apigroup:security.openshift.io] pods should have the assigned EgressIPs and EgressIPs can be updated [Skipped:Network/OpenShiftSDN] [Serial] [Suite:openshift/conformance/serial] (2m22s)
{  fail [github.com/openshift/origin/test/extended/networking/egressip.go:695]: Timed out after 120.000s.
Expected
    <bool>: false
to be true
Ginkgo exit error 1: exit with code 1}

which is a different test than the one I'm touching. How should I proceed to get this little improvement merged?

openshift-ci bot (Contributor) commented May 15, 2025

@ldoktor: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback | c0e7f89 | link | false | /test 4.12-upgrade-from-stable-4.11-e2e-aws-ovn-upgrade-rollback |
| ci/prow/e2e-gcp-ovn-etcd-scaling | c0e7f89 | link | false | /test e2e-gcp-ovn-etcd-scaling |
| ci/prow/e2e-metal-ipi-ovn-dualstack | c0e7f89 | link | false | /test e2e-metal-ipi-ovn-dualstack |
| ci/prow/okd-scos-e2e-aws-ovn | c0e7f89 | link | false | /test okd-scos-e2e-aws-ovn |
| ci/prow/e2e-aws-disruptive | c0e7f89 | link | false | /test e2e-aws-disruptive |
| ci/prow/e2e-aws-proxy | c0e7f89 | link | false | /test e2e-aws-proxy |
| ci/prow/e2e-aws-ovn-kube-apiserver-rollout | c0e7f89 | link | false | /test e2e-aws-ovn-kube-apiserver-rollout |
| ci/prow/e2e-aws-ovn-cgroupsv2 | c0e7f89 | link | false | /test e2e-aws-ovn-cgroupsv2 |
| ci/prow/e2e-azure-ovn-etcd-scaling | c0e7f89 | link | false | /test e2e-azure-ovn-etcd-scaling |
| ci/prow/e2e-gcp-disruptive | c0e7f89 | link | false | /test e2e-gcp-disruptive |
| ci/prow/e2e-aws | c0e7f89 | link | false | /test e2e-aws |
| ci/prow/e2e-metal-ipi-ovn-dualstack-bgp-techpreview | c0e7f89 | link | false | /test e2e-metal-ipi-ovn-dualstack-bgp-techpreview |
| ci/prow/e2e-metal-ipi-ovn-dualstack-local-gateway | c0e7f89 | link | false | /test e2e-metal-ipi-ovn-dualstack-local-gateway |
| ci/prow/okd-e2e-gcp | c0e7f89 | link | false | /test okd-e2e-gcp |
| ci/prow/e2e-vsphere-ovn-etcd-scaling | c0e7f89 | link | false | /test e2e-vsphere-ovn-etcd-scaling |
| ci/prow/e2e-aws-ovn | c0e7f89 | link | false | /test e2e-aws-ovn |
| ci/prow/e2e-openstack-serial | c0e7f89 | link | false | /test e2e-openstack-serial |
| ci/prow/e2e-openstack-ovn | c0e7f89 | link | false | /test e2e-openstack-ovn |
| ci/prow/e2e-metal-ipi-ovn | c0e7f89 | link | false | /test e2e-metal-ipi-ovn |
| ci/prow/e2e-metal-ipi-serial | c0e7f89 | link | false | /test e2e-metal-ipi-serial |
| ci/prow/e2e-vsphere-ovn-dualstack-primaryv6 | c0e7f89 | link | false | /test e2e-vsphere-ovn-dualstack-primaryv6 |
| ci/prow/e2e-gcp-fips-serial | c0e7f89 | link | false | /test e2e-gcp-fips-serial |
| ci/prow/e2e-aws-ovn-single-node-upgrade | c0e7f89 | link | false | /test e2e-aws-ovn-single-node-upgrade |
| ci/prow/e2e-azure-ovn-upgrade | c0e7f89 | link | false | /test e2e-azure-ovn-upgrade |
| ci/prow/e2e-aws-ovn-single-node-serial | c0e7f89 | link | false | /test e2e-aws-ovn-single-node-serial |
| ci/prow/e2e-aws-ovn-etcd-scaling | c0e7f89 | link | false | /test e2e-aws-ovn-etcd-scaling |
| ci/prow/e2e-aws-ovn-serial | c0e7f89 | link | true | /test e2e-aws-ovn-serial |
| ci/prow/e2e-aws-ovn-serial-2of2 | c0e7f89 | link | true | /test e2e-aws-ovn-serial-2of2 |
| ci/prow/e2e-aws-ovn-serial-publicnet | c0e7f89 | link | true | /test e2e-aws-ovn-serial-publicnet |

openshift-trt bot commented May 15, 2025

Job Failure Risk Analysis for sha: c0e7f89

Job Name: pull-ci-openshift-origin-main-e2e-aws-disruptive
Failure Risk: Medium
    [sig-node] static pods should start after being created
        Potential external regression detected for High Risk Test analysis
    [bz-Etcd] clusteroperator/etcd should not change condition/Available
        Potential external regression detected for High Risk Test analysis

Job Name: pull-ci-openshift-origin-main-e2e-aws-ovn-serial-2of2
Failure Risk: IncompleteTests
    Tests for this run (24) are below the historical average (1428): IncompleteTests (not enough tests ran to make a reasonable risk analysis; this could be due to infra, installation, or upgrade problems)

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.
4 participants