Increased memory usage with 3.6 event reuse #21355
Open
Labels
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
type/bug
Description
Bug report criteria
- This bug report is not security related, security issues should be disclosed privately via security@etcd.io.
- This is not a support request or question, support requests or questions should be raised in the etcd discussion forums.
- You have read the etcd bug reporting guidelines.
- Existing open issues along with etcd frequently asked questions have been checked and this is not a duplicate.
What happened?
@mcornea has been load testing etcd 3.6.5 with OpenShift 4.21 on AWS lately and found a memory regression on large clusters running the kube-burner density tests.
| Cluster Size | etcd 3.5 Avg RSS | etcd 3.6 Avg RSS | Avg Change | etcd 3.5 Max RSS | etcd 3.6 Max RSS | Max Change |
|---|---|---|---|---|---|---|
| 24 nodes | 359 MiB | 360 MiB | +0.3% | 468 MiB | 541 MiB | +15.6% |
| 120 nodes | 550 MiB | 601 MiB | +9.3% | 893 MiB | 1.24 GiB | +42.2% |
| 250 nodes | 682 MiB | 974 MiB | +42.8% | 1.13 GiB | 2.96 GiB | +162% |
We were eventually able to bisect this and identified #17563 as the culprit.
Marius has kindly tested this in a running cluster where you can see the difference in memory:
There is a pprof and an SVG render for your convenience:
What did you expect to happen?
In #17563 we expected memory usage to decrease drastically, but it seems to have the opposite effect on the density workload.
How can we reproduce it (as minimally and precisely as possible)?
- Deploy a larger Kubernetes 1.34 cluster with etcd 3.6
- Run the kube-burner cluster-density-v2 workload and observe the RSS memory metrics
- Repeat with etcd 3.5 and compare the results
Anything else we need to know?
No response
Etcd version (please run commands below)
3.6.5
Etcd configuration (command line flags or environment variables)
Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)
No response
Relevant log output