cloudbuild: bump gcb-docker-gcloud to v20260205-38cfa9523f #1399
The cloud-provider-aws-push-images postsubmit has been failing across all branches (master and release-*) since early April 2026 because cloudbuild.yaml pins gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20221214-1b4dd4d69a, and that tag has been garbage-collected out of the staging registry. Cloud Build retries the pull 10 times and fails with 'manifest unknown: Failed to fetch "v20221214-1b4dd4d69a"'.

Example failed run (v1.36.0 tag):
https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/cloud-provider-aws-push-images/2051449456091467776

This change bumps the pin to v20260205-38cfa9523f (digest sha256:ff388e0dc16351e96f8464e2e185b74a7578a5ccb7a112cf3393468e59e6e2d2), which is currently the newest tag in gcr.io/k8s-staging-test-infra/gcb-docker-gcloud and is aliased to 'latest'. The new image still provides /buildx-entrypoint, which the build step uses.

Signed-off-by: Ganesh Putta <ganiredi@amazon.com>
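For reviewers, the change amounts to swapping the image tag on the build step. The fragment below is a minimal sketch of the relevant part of cloudbuild.yaml, not a copy of the file. Only the image name, the two tags, and /buildx-entrypoint come from this PR; the step layout is an assumption and the real step may carry additional fields.

```yaml
steps:
  # Before (tag no longer resolvable in the staging registry):
  #   gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20221214-1b4dd4d69a
  # After:
  - name: gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20260205-38cfa9523f
    # Entrypoint the description says the new image still provides.
    entrypoint: /buildx-entrypoint
    # args omitted: the actual build arguments are not shown in this PR.
```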
/ok-to-test

/release-note-none

/retest

/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kmala
…-upstream-release-1.36 Automated cherry pick of #1399: cloudbuild: bump gcb-docker-gcloud to v20260205-38cfa9523f
The cloud-provider-aws-push-images postsubmit has been running up against the 1200s (20 min) Cloud Build timeout:

- The new gcb-docker-gcloud image (pinned in #1399) is larger and pushes/pulls take slightly longer, and GCB's shared pool has been slower overall, so step 1 (multi-arch buildx) now routinely uses 16-19 of the 20 available minutes.
- Step 2 (cloudbuild-artifacts) then has almost no budget left and is killed partway through `hack/install-gsutil.sh`, which means the ecr-credential-provider binaries never reach gs://k8s-staging-provider-aws/releases/.

Observed failures after #1399 landed:

- release-1.36: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/cloud-provider-aws-push-images/2051718180123971584 (step 1 ok, step 2 TIMEOUT installing gsutil)
- master: https://prow.k8s.io/view/gs/kubernetes-ci-logs/logs/cloud-provider-aws-push-images/2051514822092132352 (step 1 buildkit session deadline during registry push)

Cross-check: the last successful binary upload to the staging releases bucket was v1.35.1-5-g7dac1f6 on 2026-03-27, so step 2 has effectively been timing out for over a month, independent of the image pin issue.

Bumping the overall build timeout to 3600s (60 min) gives headroom for GCB variability, keeps step 1 multi-arch pushes reliable, and leaves enough budget for the existing install-gsutil + upload flow. A follow-up PR can skip the SDK reinstall by switching the step 2 base image.

Signed-off-by: Ganesh Putta <ganiredi@amazon.com>
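The timeout bump is a one-field change. The fragment below is a minimal sketch, assuming the limit is set through the top-level `timeout` field of cloudbuild.yaml, which is the field Cloud Build uses for the whole-build limit. The step entry and comments are placeholders built from the step descriptions above, not the file's actual contents.

```yaml
# Whole-build limit shared by all steps; previously 1200s (20 min).
timeout: 3600s
steps:
  # Step 1: multi-arch buildx build and push (reported at 16-19 min).
  # Assumed to be the step that uses the image pinned in #1399.
  - name: gcr.io/k8s-staging-test-infra/gcb-docker-gcloud:v20260205-38cfa9523f
    entrypoint: /buildx-entrypoint
  # Step 2: cloudbuild-artifacts (runs hack/install-gsutil.sh, then uploads).
  # Its image is not named in the text, so the entry is left out here.
```

Cloud Build also accepts a `timeout` on individual steps, which might be one way to cap step 1 so that step 2 keeps some budget; whether that suits this job is a separate question from this change.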