Closed
Description
Bug description
We are running z2jh (https://z2jh.jupyter.org/en/stable/) and found a socket leak in the proxy
pod.
The number of open sockets increases constantly (over 60k), and after about a week the kernel logs an error: kernel: TCP: out of memory -- consider tuning tcp_mem.
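For reference, the kernel's TCP memory pressure can be checked on the node hosting the proxy pod. This is only an illustration using standard Linux procfs paths, not something specific to the chart:

cat /proc/net/sockstat          # current usage: "TCP: inuse ... mem <pages>"
cat /proc/sys/net/ipv4/tcp_mem  # thresholds: low / pressure / high (in pages)

The "TCP: out of memory" message appears once the "mem" value in sockstat crosses the high threshold from tcp_mem.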
I have checked the number of sockets using lsof.
/srv/configurable-http-proxy $ lsof
1 /usr/local/bin/node socket:[48829679]
1 /usr/local/bin/node socket:[48829681]
1 /usr/local/bin/node socket:[48825415]
1 /usr/local/bin/node socket:[48825417]
1 /usr/local/bin/node socket:[48829792]
1 /usr/local/bin/node socket:[48829790]
1 /usr/local/bin/node socket:[48829783]
1 /usr/local/bin/node socket:[48829785]
/srv/configurable-http-proxy $ lsof | wc -l
64708
/srv/configurable-http-proxy $ lsof | wc -l
64719
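To show the trend over time, a rough sketch of what I ran inside the proxy container (reusing the same lsof command as above) to watch the socket count grow:

while true; do
  date
  lsof | grep -c 'socket:'   # count open sockets held by the node process
  sleep 300                  # sample every 5 minutes
done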
Your personal set up
- Version(s): JupyterHub Helm chart v1.1.2 (https://jupyterhub.github.io/helm-chart/)
This chart uses the proxy docker image jupyterhub/configurable-http-proxy:4.5.0
The config.yaml sections related to the proxy:
proxy:
  secretToken: xxxx
  service:
    loadBalancerIP: x.x.x.x
  https:
    enabled: true
    hosts:
      - "example.com"
    letsencrypt:
      contactEmail: "[email protected]"
  chp: # proxy pod, running jupyterhub/configurable-http-proxy
    livenessProbe:
      enabled: true
      initialDelaySeconds: 60
      periodSeconds: 20
      failureThreshold: 10 # retry 10 times before declaring failure
      timeoutSeconds: 3
      successThreshold: 1
    resources:
      requests:
        cpu: 1000m # 0m - 1000m
        memory: 5000Mi # Recommended is 100Mi - 600Mi -- we seem to run at 4.3GB a lot
  traefik: # autohttps pod (optional, running traefik/traefik)
    resources:
      requests:
        cpu: 1000m # 0m - 1000m
        memory: 512Mi # 100Mi - 1.1Gi
  secretSync: # autohttps pod (optional, sidecar container running a small Python script)
    resources:
      requests:
        cpu: 10m
        memory: 64Mi
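For completeness, the config above is applied with a normal z2jh helm upgrade; this is a sketch assuming the chart repo was added under the alias jupyterhub, and the release and namespace names (jhub) are placeholders for our actual values:

helm upgrade --install jhub jupyterhub/jupyterhub \
  --namespace jhub \
  --version 1.1.2 \
  --values config.yaml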