Description
Bug Report
What did you do?
We ran into this issue when using unsupported beta features on an older Kubernetes version:
- Create a StatefulSet dependent resource with persistentVolumeClaimRetentionPolicy on a cluster running Kubernetes version 1.26.
- Kubernetes adds persistentVolumeClaimRetentionPolicy to the managedFields but ignores the value.
- When getting the StatefulSet, the SDK expects all managed fields to be present in the resource but in this case, persistentVolumeClaimRetentionPolicy is missing.
What did you expect to see?
At most, a warning should be printed.
What did you see instead? Under which circumstances?
Error during event processing ExecutionScope{ resource id: ResourceID{name='redacted', namespace='redacted'}, version: 8966444451} failed.
io.javaoperatorsdk.operator.AggregatedOperatorException: Exception(s) during workflow execution. Details:
- eu.glasskube.operator.apps.vault.dependent.VaultStatefulSet -> java.lang.NullPointerException: Cannot invoke "java.util.Map.get(Object)" because "actualMap" is null
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.keepOnlyManagedFields(SSABasedGenericKubernetesResourceMatcher.java:144)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.fillResultsAndTraverseFurther(SSABasedGenericKubernetesResourceMatcher.java:166)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.keepOnlyManagedFields(SSABasedGenericKubernetesResourceMatcher.java:138)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.fillResultsAndTraverseFurther(SSABasedGenericKubernetesResourceMatcher.java:166)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.keepOnlyManagedFields(SSABasedGenericKubernetesResourceMatcher.java:138)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.SSABasedGenericKubernetesResourceMatcher.matches(SSABasedGenericKubernetesResourceMatcher.java:90)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.KubernetesDependentResource.match(KubernetesDependentResource.java:169)
at io.javaoperatorsdk.operator.processing.dependent.kubernetes.KubernetesDependentResource.match(KubernetesDependentResource.java:32)
at io.javaoperatorsdk.operator.processing.dependent.AbstractDependentResource.reconcile(AbstractDependentResource.java:67)
at io.javaoperatorsdk.operator.processing.dependent.SingleDependentResourceReconciler.reconcile(SingleDependentResourceReconciler.java:19)
at io.javaoperatorsdk.operator.processing.dependent.AbstractDependentResource.reconcile(AbstractDependentResource.java:52)
at io.javaoperatorsdk.operator.processing.dependent.workflow.WorkflowReconcileExecutor$NodeReconcileExecutor.doRun(WorkflowReconcileExecutor.java:115)
at io.javaoperatorsdk.operator.processing.dependent.workflow.NodeExecutor.run(NodeExecutor.java:22)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
at java.base/java.util.concurrent.FutureTask.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
at io.javaoperatorsdk.operator.processing.dependent.workflow.WorkflowResult.throwAggregateExceptionIfErrorsPresent(WorkflowResult.java:41)
at io.javaoperatorsdk.operator.processing.dependent.workflow.WorkflowReconcileResult.throwAggregateExceptionIfErrorsPresent(WorkflowReconcileResult.java:9)
at io.javaoperatorsdk.operator.processing.dependent.workflow.DefaultWorkflow.reconcile(DefaultWorkflow.java:95)
at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:147)
at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:110)
at io.javaoperatorsdk.operator.api.monitoring.Metrics.timeControllerExecution(Metrics.java:219)
at io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:109)
at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:140)
at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:121)
at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:91)
at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:64)
at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:409)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.base/java.lang.Thread.run(Unknown Source)
Environment
Kubernetes cluster type:
vanilla
$ Mention java-operator-sdk version from pom.xml file
4.4.2
$ java -version
openjdk version "17.0.8" 2023-07-18
OpenJDK Runtime Environment (Red_Hat-17.0.8.0.7-1.fc38) (build 17.0.8+7)
OpenJDK 64-Bit Server VM (Red_Hat-17.0.8.0.7-1.fc38) (build 17.0.8+7, mixed mode, sharing)
$ kubectl version
Client Version: v1.28.0
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.26.4
WARNING: version difference between client (1.28) and server (1.26) exceeds the supported minor version skew of +/-1
Possible Solution
Check if actualMap is null before adding to the result.
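A minimal sketch of the suggested null check, using plain java.util.Map structures rather than the SDK's types. The class name, method names, and sample data are hypothetical, and this is not the SDK's implementation. It only assumes the "f:<field>" key convention of managedFields and skips any managed entry whose value is absent from the actual resource:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch, NOT the SDK's SSABasedGenericKubernetesResourceMatcher.
public class ManagedFieldsPruneSketch {

  /**
   * Returns a copy of `actual` that keeps only the fields listed in `managedFields`.
   * Keys in `managedFields` are assumed to look like "f:<name>"; an empty nested map
   * marks a leaf field. Other key kinds ("k:", "v:", ".") are ignored in this sketch.
   */
  @SuppressWarnings("unchecked")
  static Map<String, Object> keepOnlyManaged(Map<String, Object> actual,
                                             Map<String, Object> managedFields) {
    Map<String, Object> result = new LinkedHashMap<>();
    if (actual == null) {
      // Guard: the field is listed as managed, but the server returned no value.
      return result;
    }
    for (Map.Entry<String, Object> entry : managedFields.entrySet()) {
      String key = entry.getKey();
      if (!key.startsWith("f:")) {
        continue;
      }
      String name = key.substring(2);
      Object value = actual.get(name);
      if (value == null) {
        // Managed field without a value (e.g. dropped by an older API server):
        // skip it instead of dereferencing null. A warning could be logged here.
        continue;
      }
      Map<String, Object> nested = (Map<String, Object>) entry.getValue();
      if (value instanceof Map && !nested.isEmpty()) {
        result.put(name, keepOnlyManaged((Map<String, Object>) value, nested));
      } else {
        result.put(name, value);
      }
    }
    return result;
  }

  /** Builds sample data in which the retention policy is managed but absent. */
  static Map<String, Object> demo() {
    Map<String, Object> leaf = Map.of();
    Map<String, Object> policyFields = Map.of("f:whenDeleted", leaf);
    Map<String, Object> specFields = Map.of(
        "f:replicas", leaf,
        "f:persistentVolumeClaimRetentionPolicy", policyFields);
    Map<String, Object> managedFields = Map.of("f:spec", specFields);

    Map<String, Object> spec = Map.of("replicas", 1);
    Map<String, Object> actual = Map.of("spec", spec);
    return keepOnlyManaged(actual, managedFields);
  }

  public static void main(String[] args) {
    System.out.println(demo()); // prints {spec={replicas=1}}
  }
}
```

With the guard in place, the managed but missing persistentVolumeClaimRetentionPolicy entry is skipped rather than causing a NullPointerException.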
Additional context
Related: glasskube/operator#267
Activity
shawkins commented on Aug 24, 2023
Relates to #2028 - it's the same NPE that happens when using a secret with stringData in the desired state.
csviri commented on Aug 25, 2023
Hi @kosmoz, as a short-term workaround you might want to try the generic matcher, see:
https://github.com/operator-framework/java-operator-sdk/blob/main/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/dependent/kubernetes/GenericKubernetesResourceMatcher.java#L19-L19
also:
https://javaoperatorsdk.io/docs/v4-4-migration#using-server-side-apply-in-dependent-resources
It would be great if you could provide a simple project to reproduce the issue, or the logs produced here: https://github.com/shawkins/java-operator-sdk/blob/c879806ba279c2a285639344a7e34c01118c8c01/operator-framework-core/src/main/java/io/javaoperatorsdk/operator/processing/dependent/kubernetes/SSABasedGenericKubernetesResourceMatcher.java#L86-L86
Thank you!
kosmoz commented on Aug 25, 2023
Hi! I'm not sure if I will have time to create a minimal reproduction repo. However, the steps would be pretty straightforward:
persistentVolumeClaimRetentionPolicy is set to something other than null.
Maybe I will have time next week, but in the meantime, here are the logs (I pruned them to the lines that I think matter for this error).
Notice how in the last line, it says actual map value: null. This is not something keepOnlyManagedFields is designed to deal with, so that's where the error happens.
Side note: This can probably happen for other resources with unsupported properties as well. I don't understand why Kubernetes keeps the managed field but discards the property…
Thanks for your suggested workaround. We already removed the offending statement, since we want to support version 1.26 anyway. I just thought that the SDK could handle this edge case more gracefully 🙂
Edit: For testing I created a cluster like this:
shawkins commented on Aug 25, 2023
I believe that case is happening as well here: kubernetes/kubernetes#118519 (comment)
While you won't get the NPE, the matching won't work for pruned fields: the match logic will see that the desired state has something that is missing in the actual state.
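To illustrate the point about pruned fields, here is a rough sketch with plain maps. All names and sample values are hypothetical, and the SDK's real matcher compares resources differently. It only shows that a desired spec containing a field the server dropped can never equal the spec the server returns:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a desired-vs-actual comparison, not the SDK's matcher.
public class PrunedFieldMatchSketch {

  /** Naive match: desired and actual specs must be equal. */
  static boolean matches(Map<String, Object> desired, Map<String, Object> actual) {
    return desired.equals(actual);
  }

  /** Top-level keys that the desired spec sets but the actual spec lacks. */
  static Set<String> missingInActual(Map<String, Object> desired,
                                     Map<String, Object> actual) {
    Set<String> missing = new TreeSet<>(desired.keySet());
    missing.removeAll(actual.keySet());
    return missing;
  }

  /** Desired spec, including the field that the older server does not keep. */
  static Map<String, Object> desiredSpec() {
    Map<String, Object> policy = Map.of("whenDeleted", "Delete");
    return Map.of("replicas", 1, "persistentVolumeClaimRetentionPolicy", policy);
  }

  /** Spec as returned by the server after it dropped the unsupported field. */
  static Map<String, Object> serverSpec() {
    return Map.of("replicas", 1);
  }

  public static void main(String[] args) {
    System.out.println(matches(desiredSpec(), serverSpec()));         // prints false
    System.out.println(missingInActual(desiredSpec(), serverSpec())); // prints [persistentVolumeClaimRetentionPolicy]
  }
}
```

Under these assumptions the comparison keeps reporting a mismatch for as long as the desired state sets the dropped field.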
csviri commented on Aug 29, 2023
see #2038