Automated cherry pick of #1399: cloudbuild: bump gcb-docker-gcloud to v20260205-38cfa9523f #1419: cloudbuild: bump timeout to 3600s #1421: cloudbuild: upgrade to E2_HIGHCPU_32 to fix session timeout #1426

Merged
k8s-ci-robot merged 3 commits into kubernetes:release-1.31 from Ganiredi:automated-cherry-pick-of-#1399-#1419-#1421-upstream-release-1.31 on May 7, 2026

Ganiredi (Contributor) commented May 6, 2026

Cherry pick of #1399 #1419 #1421 on release-1.31.

#1399: cloudbuild: bump gcb-docker-gcloud to v20260205-38cfa9523f
#1419: cloudbuild: bump timeout to 3600s
#1421: cloudbuild: upgrade to E2_HIGHCPU_32 to fix session timeout

For details on the cherry pick process, see the cherry pick requests page.


Ganiredi added 3 commits May 6, 2026 15:31
cloudbuild: bump gcb-docker-gcloud to v20260205-38cfa9523f

The cloud-provider-aws-push-images postsubmit has been failing across
all branches (master and release-*) since early April 2026 because
cloudbuild.yaml pins
gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20221214-1b4dd4d69a,
and that tag has been garbage-collected out of the staging registry.
Cloud Build retries the pull 10 times and fails with 'manifest unknown:
Failed to fetch "v20221214-1b4dd4d69a"'.

Example failed run (v1.36.0 tag):
https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/cloud-provider-aws-push-images/2051449456091467776

Bumping to v20260205-38cfa9523f (digest
sha256:ff388e0dc16351e96f8464e2e185b74a7578a5ccb7a112cf3393468e59e6e2d2),
currently the newest tag in gcr.io/k8s-staging-test-infra/gcb-docker-gcloud
and aliased to 'latest'. This image still provides /buildx-entrypoint used
by the build step.

Signed-off-by: Ganesh Putta <ganiredi@amazon.com>
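
For illustration, the pin change described in this commit would look roughly like the fragment below. This is a sketch under assumptions, not the actual diff: only the image reference, the digest, and the /buildx-entrypoint entrypoint come from the commit message. The surrounding layout is a generic Cloud Build step, and the step's real arguments are not shown in this PR text, so they are left out.

```yaml
# Hypothetical excerpt of cloudbuild.yaml; layout assumed, args omitted.
steps:
  # Old pin (tag no longer present in the staging registry):
  #   gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20221214-1b4dd4d69a
  # New pin; per the commit message it resolves to digest
  #   sha256:ff388e0dc16351e96f8464e2e185b74a7578a5ccb7a112cf3393468e59e6e2d2
  - name: gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20260205-38cfa9523f
    entrypoint: /buildx-entrypoint
```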

cloudbuild: bump timeout to 3600s

The cloud-provider-aws-push-images postsubmit has been running up
against the 1200s (20 min) Cloud Build timeout:

- The new gcb-docker-gcloud image (pinned in kubernetes#1399) is larger, so
  pushes/pulls take slightly longer, and GCB's shared pool has been
  slower overall. As a result, step 1 (multi-arch buildx) now routinely
  uses 16-19 of the 20 available minutes.
- Step 2 (cloudbuild-artifacts) then has almost no budget left and
  is killed mid-`hack/install-gsutil.sh`, which means the
  ecr-credential-provider binaries never reach
  gs://k8s-staging-provider-aws/releases/.

Observed failures after kubernetes#1399 landed:
- release-1.36:
  https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/cloud-provider-aws-push-images/2051718180123971584
  (step 1 ok, step 2 TIMEOUT installing gsutil)
- master:
  https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/cloud-provider-aws-push-images/2051514822092132352
  (step 1 buildkit session deadline during registry push)

Cross-check: the last successful binary upload to the staging
releases bucket was v1.35.1-5-g7dac1f6 on 2026-03-27 — step 2 has
effectively been timing out for over a month, independent of the
image pin issue.

Bumping the overall build timeout to 3600s (60 min) gives headroom
for GCB variability, keeps step 1 multi-arch pushes reliable, and
leaves enough budget for the existing install-gsutil + upload flow.
A follow-up PR can skip the SDK reinstall by switching the step 2
base image.

Signed-off-by: Ganesh Putta <ganiredi@amazon.com>
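
As a rough sketch of the change this commit describes, the top-level `timeout` field in a Cloud Build config bounds the whole build, that is, all steps combined. The values come from the commit message; the placement within the real cloudbuild.yaml is an assumption.

```yaml
# Hypothetical excerpt of cloudbuild.yaml.
# Before: 1200s (20 min). Step 1 alone used 16-19 min, which left
# only about 1-4 min for step 2 (cloudbuild-artifacts).
timeout: 3600s  # 60 min for the whole build
```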

cloudbuild: upgrade to E2_HIGHCPU_32 to fix session timeout

Multi-arch builds (linux/amd64 + linux/arm64) take ~12 min on N1_HIGHCPU_8,
causing the BuildKit session to expire before the push phase completes:

  error: no active session: context deadline exceeded

Other K8s repos with large multi-arch binaries (aws-ebs-csi-driver,
cloud-provider-gcp) use E2_HIGHCPU_32, which completes builds in ~3-5 min,
well within the session window.
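
A minimal sketch of the machine type change, assuming it is set through the standard `options.machineType` field of the Cloud Build config. The machine type names come from the commit message; whether N1_HIGHCPU_8 was set explicitly in the previous config is not stated here.

```yaml
# Hypothetical excerpt of cloudbuild.yaml.
options:
  # The commit message reports ~12 min multi-arch builds on N1_HIGHCPU_8
  # and ~3-5 min on E2_HIGHCPU_32 in comparable repos.
  machineType: E2_HIGHCPU_32
```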
k8s-ci-robot added the do-not-merge/release-note-label-needed label (indicates that a PR should not merge because it's missing one of the release note labels) on May 6, 2026.

k8s-ci-robot requested review from hakman and olemarkus on May 6, 2026 20:31.

k8s-ci-robot added the following labels on May 6, 2026:
- cncf-cla: yes (indicates the PR's author has signed the CNCF CLA)
- needs-kind (indicates a PR lacks a `kind/foo` label and requires one)
- needs-triage (indicates an issue or PR lacks a `triage/foo` label and requires one)

k8s-ci-robot (Contributor) commented

This issue is currently awaiting triage.

If cloud-provider-aws contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

k8s-ci-robot added the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on May 6, 2026.

k8s-ci-robot (Contributor) commented

Hi @Ganiredi. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


k8s-ci-robot added the size/XS label (denotes a PR that changes 0-9 lines, ignoring generated files) on May 6, 2026.

kmala (Member) commented May 6, 2026

/ok-to-test

k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on May 6, 2026.

kmala (Member) commented May 7, 2026

/lgtm
/approve
/release-note-none

k8s-ci-robot added the release-note-none label (denotes a PR that doesn't merit a release note) and removed the do-not-merge/release-note-label-needed label (indicates that a PR should not merge because it's missing one of the release note labels) on May 7, 2026.

k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on May 7, 2026.

k8s-ci-robot (Contributor) commented

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kmala

The full list of commands accepted by this bot can be found here.

The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on May 7, 2026.

k8s-ci-robot merged commit fe1a3bb into kubernetes:release-1.31 on May 7, 2026.
5 checks passed
Labels

- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
- lgtm: "Looks good to me", indicates that a PR is ready to be merged.
- needs-kind: Indicates a PR lacks a `kind/foo` label and requires one.
- needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
- ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
- release-note-none: Denotes a PR that doesn't merit a release note.
- size/XS: Denotes a PR that changes 0-9 lines, ignoring generated files.

3 participants