Fix route-controller by fanning out getInstancesByIDs through the batcher#1388
babyhuey wants to merge 1 commit into
/check-cla
1041ac0 to feeba59
	InstanceIds: instanceIDs,
}

instances, err := c.describeInstanceBatcher.DescribeInstances(ctx, request)
I think we should add this functionality to the batcher instead of putting complex logic in the client.
feeba59 to 6c1ebf1
Thanks @kmala, that makes sense. I moved the fan out into |
The describeInstanceBatcher currently rejects DescribeInstances inputs with anything other than a single instance ID, which breaks every caller of getInstancesByIDs that legitimately passes multiple IDs (notably the route controller's UpdateRoutes path). See kubernetes#1351.

Move the fan-out logic into the batcher itself: multi-ID inputs are split into one batcher submission per ID so the batcher can coalesce them with other in-flight requests, and the aggregated results (and joined errors, if any) are returned to the caller. The caller in getInstancesByIDs is unchanged.

Adds a regression test exercising a multi-ID input through the batcher.
6c1ebf1 to 7e2c805
/ok-to-test
What type of PR is this?
/kind bug
What this PR does / why we need it:
describeInstanceBatcher.DescribeInstances requires an input with exactly one instance ID: https://github.com/kubernetes/cloud-provider-aws/blob/master/pkg/providers/v1/describe_instance_batch.go#L56-L58

The batcher's BatchExecutor is what aggregates concurrent single-instance inputs back into one DescribeInstances API call. But getInstancesByIDs was passing the full slice of instance IDs as a single input, which trips that guard and errors out any time more than one instance is looked up.

The most visible consequence is that in clusters using kubenet / route-based pod networking, the route-controller's initial ListRoutes call always needs to look up every node and therefore always fails with "expected to receive a single instance only, found N". No pod-CIDR VPC routes get created and source-dest-check is left enabled on new workers, so pods on new nodes have no connectivity (see #1351).

This PR fans out one goroutine per instance ID, each passing a single-instance input into the batcher. The batcher then coalesces those concurrent calls into one underlying DescribeInstances API call, preserving the original batching behavior while honoring the single-instance contract.

Which issue(s) this PR fixes:
Fixes #1351
Special notes for your reviewer:
go build ./... clean
go test ./pkg/providers/v1/... passes
getInstancesByIDs (aws.go:3412) already passes a single-ID slice, so its behavior is unchanged.
Does this PR introduce a user-facing change?