PollingEventSource should trigger reconcile when a secondary resource is missing #885
Comments
Hi @scrocquesel
This is expected behavior: an event should be propagated if a dependent resource is deleted. It's not clear to me whether you meant this as a problem?
Do I understand correctly that you mean a triggering event should be propagated on startup, because the dependent resource was deleted in the background in the meantime? (Sorry, I am just trying to understand what you mean.) |
No, this is indeed expected.
Yes, otherwise it is not possible to recreate the missing dependent resource. The only way to trigger reconciliation is to update the primary resource. |
@scrocquesel one way to solve this is set
The problem is that this would not even be consistent with the Informers (nor with PerResourcePollingEventSource). If an operator is down and a k8s resource is deleted in the meantime, the informer does not propagate an event on startup. Also, isn't what you are saying equivalent to: on startup there should be an event for every resource? |
Was pretty sure it was the case for
As you said, this is the expected behavior when Somehow, we must reconcile when the dependent resource has been deleted while the operator was down. A resource modified while the operator is down works. As far as I understand, it triggers reconcile not on startup but when the next polling occurs. I guess that, as the cache is empty, it detects that there is now an object and that it is different. |
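The asymmetry described above (a modification made while the operator was down is noticed, a deletion is not) can be reproduced with a naive diff between the cache and the poll result. The sketch below is only an assumed model of such a diff, not the SDK's actual code; `NaivePollDiff` and `keysToReconcile` are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of a poll-vs-cache diff; not taken from the SDK. */
final class NaivePollDiff {
    private NaivePollDiff() {}

    /** Returns the keys whose polled state differs from the cached state. */
    static <K, V> Set<K> keysToReconcile(Map<K, V> cache, Map<K, V> polled) {
        Set<K> changed = new LinkedHashSet<>();
        // Added or modified: present in the poll result and different from the cache.
        for (Map.Entry<K, V> e : polled.entrySet()) {
            if (!e.getValue().equals(cache.get(e.getKey()))) {
                changed.add(e.getKey());
            }
        }
        // Deleted: present in the cache but missing from the poll result.
        for (K key : cache.keySet()) {
            if (!polled.containsKey(key)) {
                changed.add(key);
            }
        }
        return changed;
    }
}
```

With an empty cache after a restart, a resource that still exists shows up as added, while a resource deleted during the downtime is missing on both sides and produces no key at all.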
I think that it should make the difference between knowing the resource doesn't exist (the poller returns Optional.empty()) and not knowing, like at startup when the cache is really empty. Something like
That is, the cache should be a I did a quick and dirty dev so it works as I expected for But I don't know the whole picture, and there may be cons I don't see, like memory overhead. |
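One possible way to express the distinction between "known to be absent" and "not yet known" is a cache that stores an Optional per key, so that a missing key means "never polled". This is a minimal sketch under that assumption, not the change the commenter actually made; `TriStateCache`, `State`, `stateOf` and `record` are hypothetical names.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch, not the SDK's cache: separates "never polled" from "polled and absent". */
final class TriStateCache<K, V> {
    enum State { UNKNOWN, ABSENT, PRESENT }

    // A missing key means UNKNOWN; Optional.empty() means the poller reported absence.
    private final Map<K, Optional<V>> entries = new ConcurrentHashMap<>();

    State stateOf(K key) {
        Optional<V> entry = entries.get(key);
        if (entry == null) {
            return State.UNKNOWN;
        }
        return entry.isPresent() ? State.PRESENT : State.ABSENT;
    }

    /** Records a poll result and returns true when a reconcile should be triggered. */
    boolean record(K key, Optional<V> polled) {
        Optional<V> previous = entries.put(key, polled);
        if (previous == null) {
            // First observation since startup: the prior state is unknown, so reconcile.
            return true;
        }
        return !previous.equals(polled);
    }
}
```

The cost is one entry per primary resource even when no secondary resource exists, which is the kind of memory overhead mentioned above.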
Yes, because the secret was there. If it had been deleted in the meantime, it would not trigger the reconciliation.
The problem is that you don't know if it was deleted or just not created yet. The modified case works because the resource is there, so basically that is reported. TBH I don't see how this would help in general: you need to know the resourceID already, but isn't it coming from the polled resource / poller? In case the operator is replicated (we don't support it yet), this should not be a problem. But this could also be an issue with Informers on deleted objects, so probably Will create a PR to not delay the PollingEventSource on startup, but for now I think what you should do is just set this to false. Anyway, thanks very much for pointing this out! |
This is loosely related: #901 |
From the reconcile POV, it makes no difference. We are not doing reconciliation based on add/delete/modify delta operations, and it will not be possible to pass such an operation when calling reconcile. To me, handleDelete and handleEvent are only for managing the cache and deciding whether the reconcile should be triggered. I see three use cases:
As you said, maybe the easiest way is to trigger reconcile for all resources at startup to allow the cache to be populated with a reconciled observed state of dependent resources. |
@sclorng I completely agree, from the comment before. @sclorng @scrocquesel So again, I am not sure we understand each other (happy to have a call on our Discord server or somewhere else to discuss, then write a summary under this issue), but: I made a change in this PR: #901. From now on, PollingEventSource will synchronize on startup. That means that if you check for the resource during the reconciliation on startup and the resource is not there, it was not created yet, see: So in my opinion the workflow should be the following (after this feature is merged; it will be released today or tomorrow):
Do you agree with this? Does this solve the problem that we are in a kind of intermediate state? If yes, I will also add it to the (java)docs for now. Sorry, this is relatively new stuff; I anticipated exactly this kind of feedback. So thank you again. Happy to also discuss this in depth in a call. |
For the manual step, this is the same issue as #870. Even if we don't put it in the cache, it will later poll and eventually reconcile to sync. The important thing to document is that the cache should be called once per unit of work for each resource, as it can't keep track of what's being done during this unit of work. UoW = reconcile or cleanup.
Wouldn't it be better to lazily populate the cache? When calling getAssociated if the cache has not yet been |
Yes, it is very similar, but it also serves a different purpose. In #870 it is mainly to not propagate the event. On the polling event source this is an aspect too, but in PollingEventSource there is also a more probabilistic situation. If you don't populate it manually, there can be a new reconciliation between the current reconciliation and the next poll (note the time period) that will actually populate the cache. If that happens, the reconciler will again look into the cache and the cache will still be empty.
For |
I was thinking of doing that only on the first call to the cache. Then a flag would say it has polled at least once, so subsequent calls will rely on this state (or the one updated by the timer). So it is just delaying the first |
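The lazy-population idea (poll on the first cache lookup, then rely on the timer) could look roughly like the sketch below. It is an illustration under assumptions, not SDK code: `LazyPollingCache` and `refresh` are hypothetical, the `getAssociated` signature here is simplified to a plain key lookup, and the poller is modeled as a plain `Supplier` of a map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch of a cache that polls lazily on the first lookup. */
final class LazyPollingCache<K, V> {
    private final Supplier<Map<K, V>> poller;
    private final Map<K, V> cache = new HashMap<>();
    // Flag saying the external system has been polled at least once.
    private boolean polledOnce = false;

    LazyPollingCache(Supplier<Map<K, V>> poller) {
        this.poller = poller;
    }

    /** Called by the timer, or by the first lookup, to replace the cached state. */
    synchronized void refresh() {
        Map<K, V> polled = poller.get();
        cache.keySet().retainAll(polled.keySet()); // drop entries deleted externally
        cache.putAll(polled);                      // add or update the rest
        polledOnce = true;
    }

    /** Looks up the cached value, polling first if no poll has happened yet. */
    synchronized Optional<V> getAssociated(K key) {
        if (!polledOnce) {
            refresh();
        }
        return Optional.ofNullable(cache.get(key));
    }
}
```

After the first lookup, an empty result means the last poll did not return the resource, rather than that no poll has happened yet.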
Well, yes, that is an interesting idea; it could be optimized this way. I am not sure it helps that much in real-life scenarios, since you will eventually have a custom resource in that state. And on an operator restart it would be good to poll before the first sync, as done now in that PR. But I am not against it. Feel free to create a PR, maybe with a separate issue, to handle that. |
Bug Report
I was using PerResourcePollingEventSource, but with a lot of primary resources it can put a lot of pressure on the external system. I refactored around a PollingEventSource to batch the external resource gets, but the difference in behavior means it is not a drop-in replacement.
What did you do?
ObservedGenerationAwareStatus.
If I manually create the secondary resource and wait for the polling period to pick up the new secondary resource, reconcile is triggered, and if the secondary resource is then manually deleted, the next polling will also trigger the reconcile.
What did you expect to see?
At startup, reconcile should be called for each primary resource not in the map returned by the PollingEventSource supplier.
Ideally, PollingEventSource and PerResourcePollingEventSource should be implementation details and should not change the way the reconciler works.
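The startup expectation stated in this report (reconcile each primary resource that has no entry in the map returned by the supplier) amounts to a set difference. A minimal sketch, assuming the primary resource keys and the polled map share the same key type; `StartupDiff` and `primariesMissingSecondary` are hypothetical names, not SDK API.

```java
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: which primaries have no secondary resource after the first poll. */
final class StartupDiff {
    private StartupDiff() {}

    static <K> Set<K> primariesMissingSecondary(Set<K> primaries, Map<K, ?> polled) {
        // Copy first so the caller's set is left untouched.
        Set<K> missing = new LinkedHashSet<>(primaries);
        missing.removeAll(polled.keySet());
        return missing;
    }
}
```

Each returned key would then be a candidate for a reconcile on startup, so that a secondary resource deleted while the operator was down can be recreated.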