Try to remove a crashing pod to test cpu situation in tp jobs#31200
Try to remove a crashing pod to test cpu situation in tp jobs#31200xueqzhan wants to merge 3 commits into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
WalkthroughThis PR refactors the audit-logs must-gather test by consolidating validation logic, adding new assertions for sequential apiserver execution and OAuth audit content, restructuring error handling in the retry loop, and simplifying must-gather invocations in two other tests. ChangesAudit Logs Test Refactoring
🎯 3 (Moderate) | ⏱️ ~20 minutes Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 5 warnings)
✅ Passed checks (6 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: xueqzhan The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/payload-job periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-rhcos9-techpreview |
|
@xueqzhan: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/57d1de60-5473-11f1-8018-90fc888065f7-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-upgrade-fips-rhcos9-techpreview |
|
@xueqzhan: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/71dac150-5473-11f1-8c56-bf533e7f0877-0 |
|
Scheduling required tests: |
|
/payload-aggregate periodic-ci-openshift-release-main-ci-5.0-e2e-aws-ovn-rhcos9-techpreview 5 |
|
@xueqzhan: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/91671ef0-5474-11f1-98a8-f4e80f2c9b13-0 |
|
/payload-aggregate periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-upgrade-fips-rhcos9-techpreview 4 |
|
@xueqzhan: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/ab855ea0-5474-11f1-8777-eccceea9d0ed-0 |
|
/payload-aggregate periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-upgrade-fips-rhcos9-techpreview 6 |
|
@xueqzhan: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a02bb390-54ad-11f1-84f9-af2b2ba73bbf-0 |
|
/payload-aggregate periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-ovn-upgrade-fips-rhcos9-techpreview 6 |
|
@xueqzhan: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/a6785b50-5556-11f1-8ae8-0bd1020826e7-0 |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended/cli/mustgather.go`:
- Around line 294-295: The current code calls result.Flakef unconditionally when
lock.log exists, causing false flakes; update the logic around the
result.Flakef(apiserver, lockLog) call so it only reports a flake when the
captured lockLog actually contains content (e.g., check len(lockLog) > 0 or
strings.TrimSpace(string(lockLog)) != "" before calling result.Flakef). Locate
the call to result.Flakef and the lockLog variable in mustgather.go and wrap the
Flakef invocation in that conditional, returning nil (or continuing) when
lockLog is empty.
- Line 299: Replace the incorrect Gomega assertion
o.Expect(seen.HasAll(expectedAuditSubDirs...), o.BeTrue()) with the proper
matcher call by invoking .To(o.BeTrue()) on the Expect result (i.e.
o.Expect(seen.HasAll(expectedAuditSubDirs...)).To(o.BeTrue())), and update the
lock log handling so result.Flakef(...) is only called when lockLog actually
indicates the “lock was still held” condition (gate the flake report on a
specific substring/marker in lockLog such as the known "lock was still held"
message or other reliable indicator rather than any presence of lock.log).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: f74e07b4-39e8-4915-b499-6b698bcc1dfc
📒 Files selected for processing (1)
test/extended/cli/mustgather.go
| result.Flakef("kubelet launched %s without waiting for the old process to terminate (lock was still hold): \n\n%s", apiserver, string(lockLog)) | ||
| return nil |
There was a problem hiding this comment.
Only report flake when lock.log is non-empty.
At Line 294, result.Flakef(...) runs unconditionally when lock.log exists, even if there’s no lock contention evidence. This can generate false flaky signals.
Suggested fix
lockLog, err := os.ReadFile(path)
o.Expect(err).NotTo(o.HaveOccurred())
- result.Flakef("kubelet launched %s without waiting for the old process to terminate (lock was still hold): \n\n%s", apiserver, string(lockLog))
+ if strings.TrimSpace(string(lockLog)) != "" {
+ result.Flakef("kubelet launched %s without waiting for the old process to terminate (lock was still hold): \n\n%s", apiserver, string(lockLog))
+ }
return nil📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| result.Flakef("kubelet launched %s without waiting for the old process to terminate (lock was still hold): \n\n%s", apiserver, string(lockLog)) | |
| return nil | |
| lockLog, err := os.ReadFile(path) | |
| o.Expect(err).NotTo(o.HaveOccurred()) | |
| if strings.TrimSpace(string(lockLog)) != "" { | |
| result.Flakef("kubelet launched %s without waiting for the old process to terminate (lock was still hold): \n\n%s", apiserver, string(lockLog)) | |
| } | |
| return nil |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended/cli/mustgather.go` around lines 294 - 295, The current code
calls result.Flakef unconditionally when lock.log exists, causing false flakes;
update the logic around the result.Flakef(apiserver, lockLog) call so it only
reports a flake when the captured lockLog actually contains content (e.g., check
len(lockLog) > 0 or strings.TrimSpace(string(lockLog)) != "" before calling
result.Flakef). Locate the call to result.Flakef and the lockLog variable in
mustgather.go and wrap the Flakef invocation in that conditional, returning nil
(or continuing) when lockLog is empty.
| }) | ||
| o.Expect(err).NotTo(o.HaveOccurred()) | ||
| } | ||
| o.Expect(seen.HasAll(expectedAuditSubDirs...), o.BeTrue()) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify suspicious Gomega usage where a matcher may be passed as Expect's optional description argument.
rg -nP 'o\.Expect\([^)]*,\s*o\.[A-Za-z_][A-Za-z0-9_]*\([^)]*\)\s*\)' test/extended/cli/mustgather.goRepository: openshift/origin
Length of output: 42
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="test/extended/cli/mustgather.go"
echo "== Lines 280-320 =="
nl -ba "$file" | sed -n '280,320p'
echo
echo "== Lines 250-310 (broader context) =="
nl -ba "$file" | sed -n '240,310'
echo
echo "== Search for result.Flakef =="
rg -n "result\.Flakef" "$file"
echo
echo "== Context around lock.log =="
rg -n "lock\.log" "$file" -n
rg -n -C 3 "lock\.log" "$file"Repository: openshift/origin
Length of output: 141
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="test/extended/cli/mustgather.go"
echo "== Snippet around line 299 =="
sed -n '290,310p' "$file"
echo
echo "== Lines with result.Flakef =="
rg -n "result\.Flakef" "$file" || true
echo
echo "== Lines with lock.log =="
rg -n "lock\.log" "$file" || true
echo
echo "== Context around lock.log (if present) =="
# Print a small window around every lock.log occurrence
awk '
{ if ($0 ~ /lock\.log/) { start=(NR-8>1?NR-8:1); end=NR+12; for(i=start;i<=end;i++){ getline l < ARGV[1]; } } }
' "$file" >/dev/null 2>&1 || true
# Simpler: use rg line numbers and sed around them
while IFS=: read -r ln _; do
echo "-- lock.log at line $ln --"
sed -n "$((ln-8)),$((ln+12))p" "$file"
echo
done < <(rg -n "lock\.log" "$file" || true)Repository: openshift/origin
Length of output: 1317
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="test/extended/cli/mustgather.go"
echo "== Lines 250-305 =="
sed -n '250,305p' "$file"
echo
echo "== Lines 305-340 =="
sed -n '305,340p' "$file"
echo
echo "== Find any Expect(actual, o.<matcher>()) patterns =="
rg -n "o\.Expect\([^,\)]*,\s*o\.[A-Za-z_][A-Za-z0-9_]*\([^)]*\)\s*\)" "$file" || true
echo
echo "== Find specifically BeTrue second-arg usage =="
rg -n "o\.Expect\([^,\)]*,\s*o\.BeTrue\(\)\s*\)" "$file" || trueRepository: openshift/origin
Length of output: 3539
🌐 Web query:
Go Gomega: what does Expect(actual, extra ...interface{}) do when the extra argument is a matcher like BeTrue()? Is it treated as an assertion or just a failure message/description?
💡 Result:
In Gomega, when you call Expect(actual, extra...interface{}), the extra arguments are not used as matchers [1]. Instead, they are treated as an optional description or annotation for the assertion [2][3]. If an assertion fails, these extra arguments are used to annotate the failure message [1][2]. You can pass: - A format string (optionally with additional arguments) to be processed by fmt.Sprintf [1][2][3]. - A function of the type func string, which Gomega will lazily evaluate only if the assertion fails to produce a custom failure message [1][2]. The actual matcher (e.g., BeTrue) must be passed to the subsequent method chain, such as.To(BeTrue) or.Should(BeTrue) [2][3]. Passing a matcher as an extra argument in the Expect call will not perform an assertion against that matcher; it will simply be treated as an object to be included in the failure annotation [1]. In summary, the Expect function handles the "actual" value, while the subsequent.To or.Should method handles the matcher [2][3]. The extra arguments in Expect are strictly for descriptive purposes [1][2].
Citations:
- 1: https://github.com/onsi/gomega/blob/master/gomega_dsl.go
- 2: https://onsi.github.io/gomega/
- 3: https://edge.googlesource.com/third_party/onsi/gomega.git/+/ab719d9e32b3d56d8310ad08a2e1bd939b7d7817/index.md
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="test/extended/cli/mustgather.go"
echo "== Search for lock.log writer/creation in this file =="
rg -n "lock\.log|lockLog|still hold|kubelet launched" "$file"
echo
echo "== Broader context around result.Flakef call =="
sed -n '265,305p' "$file"
echo
echo "== Context earlier/later around lock.log handling =="
# Print a larger window around every mention of lock.log
rg -n "lock\.log" "$file" | cut -d: -f1 | sort -n | while read -r ln; do
echo "-- around lock.log at line $ln --"
sed -n "$((ln-25)),$((ln+25))p" "$file"
echo
doneRepository: openshift/origin
Length of output: 7474
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "Flakef\(" -S . | head -n 50Repository: openshift/origin
Length of output: 3439
🏁 Script executed:
#!/bin/bash
set -euo pipefail
file="pkg/test/ginkgo/result/result.go"
echo "== File $file =="
sed -n '1,120p' "$file"Repository: openshift/origin
Length of output: 1033
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== Search for lock.log producer/meaning =="
rg -n "lock\.log" -S . | head -n 50
echo
echo "== Search for phrase \"still hold\" =="
rg -n "still hold" -S . | head -n 50
echo
echo "== Search for \"kubelet launched\" phrase =="
rg -n "kubelet launched .*without waiting" -S . | head -n 50Repository: openshift/origin
Length of output: 1831
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "lock\.log|fileName == \"lock\.log\"|lock\.log check|lock was still hold|kubelet launched" -S . \
--glob '!*vendor/*'Repository: openshift/origin
Length of output: 504
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "LastFlake\(" -S . --glob '!*vendor/*'Repository: openshift/origin
Length of output: 132
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "LastFlake" -S . --glob '!*vendor/*'Repository: openshift/origin
Length of output: 132
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "flakeGinkgo|LastFlake\(|LastFlake" -S . --glob '!*vendor/*'Repository: openshift/origin
Length of output: 387
🏁 Script executed:
#!/bin/bash
set -euo pipefail
rg -n "flakeGinkgo|Flakef\(|LastFlake\(" pkg/test/ginkgo -S
ls -la pkg/test/ginkgo/result
sed -n '1,200p' pkg/test/ginkgo/result/result.goRepository: openshift/origin
Length of output: 1562
Fix Gomega assertion at test/extended/cli/mustgather.go:299
o.Expect(seen.HasAll(expectedAuditSubDirs...), o.BeTrue()) passes o.BeTrue() as an extra argument to Expect, which Gomega treats as a description (not a matcher), so the truth check doesn’t run. Use .To(o.BeTrue()) instead.
Suggested fix
- o.Expect(seen.HasAll(expectedAuditSubDirs...), o.BeTrue())
+ o.Expect(seen.HasAll(expectedAuditSubDirs...)).To(o.BeTrue())result.Flakef(...)is triggered whenever alock.logfile is encountered, without checking whetherlockLogcontents actually indicate the “lock was still hold” condition—consider gating on specific content/markers inlockLogto avoid spurious flake records.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| o.Expect(seen.HasAll(expectedAuditSubDirs...), o.BeTrue()) | |
| o.Expect(seen.HasAll(expectedAuditSubDirs...)).To(o.BeTrue()) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/extended/cli/mustgather.go` at line 299, Replace the incorrect Gomega
assertion o.Expect(seen.HasAll(expectedAuditSubDirs...), o.BeTrue()) with the
proper matcher call by invoking .To(o.BeTrue()) on the Expect result (i.e.
o.Expect(seen.HasAll(expectedAuditSubDirs...)).To(o.BeTrue())), and update the
lock log handling so result.Flakef(...) is only called when lockLog actually
indicates the “lock was still held” condition (gate the flake report on a
specific substring/marker in lockLog such as the known "lock was still held"
message or other reliable indicator rather than any presence of lock.log).
|
Scheduling required tests: |
Summary by CodeRabbit