Notification if container is unhealthy > renew #40

Open

Is the listener capable of sending a notification when a container in a service becomes unhealthy and is then renewed?

Thanks

Contributor

When a service becomes unhealthy, DFSL will send a notification once it becomes healthy again.

Author

This is not what happens with version 18.11.28-19: when my service is unhealthy, Docker Swarm kills the container and starts another one, but the listener doesn't send any notification.

Do you need more information?

Contributor

Can you provide information about how your service is set up?

Author

My services are set up like this:

version: '3.4'

services:
  webapp-front-http:
    image: apache:latest
    ports:
      - "80"
    healthcheck:
      test: "curl -f http://127.0.0.1/server-status?auto || exit 1"
      interval: 60s
      timeout: 5s
      retries: 5
      start_period: 10s
    deploy:
      mode: replicated
      replicas: 3
      labels:
        com.df.consulName: 'front-http'
        com.df.stackName: 'regiecamp_webapp'
        com.df.scrapeNetwork: 'regiecamp_webapp_default'
        com.df.notify: 'true'
        com.df.port: '80'
      restart_policy:
        condition: on-failure
      update_config:
        order: start-first
        parallelism: 3
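
For reference, the healthcheck settings above determine how long a container can keep failing before Docker marks it unhealthy. The following is only a rough estimate in Python, assuming every probe fails and takes up to the full timeout. The function name and the simplified timing model are illustrative and are not taken from Docker's source.

```python
def seconds_until_unhealthy(interval_s, timeout_s, retries):
    """Rough upper bound on how long a running container keeps failing
    its healthcheck before Docker marks it unhealthy.

    Simplified model (an assumption): each probe starts `interval_s`
    after the previous one finishes, a failing probe may take up to
    `timeout_s`, and `retries` consecutive failures are required.
    start_period is ignored because it only matters during startup.
    """
    return retries * (interval_s + timeout_s)


# Values from the compose file above: interval 60s, timeout 5s, retries 5.
print(seconds_until_unhealthy(60, 5, 5))  # 325
```

Under these assumptions a broken container could stay in place for a little over five minutes before Swarm replaces it, which is worth keeping in mind when waiting for events during a test.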

Contributor

As a debugging option, can you listen to Docker events by running: docker events -f type=service and see which events fire when:

Service starts
Service gets unhealthy
Service becomes healthy

I didn't ask before, does your service become healthy at some point?
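
Health changes and restarts are reported as container events, so it may also help to watch `type=container` events next to the service events. Below is a minimal Python sketch that builds such a `docker events` command. The helper names `build_events_command` and `stream_events` are hypothetical; `die` and `health_status` are standard container event names. Container events are typically only visible on the node where the container runs, so this would need to run on that node.

```python
import subprocess


def build_events_command(event_type, events=()):
    """Return the argv for `docker events` filtered by type and event names."""
    cmd = ["docker", "events", "--filter", f"type={event_type}"]
    for name in events:
        # Repeating the same filter key matches any of the given values.
        cmd += ["--filter", f"event={name}"]
    return cmd


def stream_events(cmd):
    """Yield raw event lines as the Docker CLI prints them (blocks until stopped)."""
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            yield line.rstrip("\n")


# Container-level events that accompany a failed healthcheck and a restart.
command = build_events_command("container", ["die", "health_status"])
print(" ".join(command))
```

Passing `command` to `stream_events` on a Swarm node would print the events as they arrive. Comparing that stream with the `type=service` stream should show whether a container replacement produces any service event at all.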

Author

The container is killed by Docker and a new one is started after that, so the service becomes healthy when the new container is started.

I will try the debugging process and give you feedback.

Author

So I tried the command:
docker events -f type=service

Sample of the response:

2019-01-03T00:11:04.055685338+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php)
2019-01-03T00:11:04.080270513+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php, updatestate.new=updating)
2019-01-03T00:11:12.256159085+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:11:22.021262027+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php)
2019-01-03T00:11:22.031196553+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php, updatestate.new=updating)
2019-01-03T00:11:29.965940816+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:12:17.609856908+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http)
2019-01-03T00:12:17.617948834+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http)
2019-01-03T00:12:17.630087290+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http, updatestate.new=updating)
2019-01-03T00:12:17.641321163+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, updatestate.new=updating)
2019-01-03T00:13:26.301852250+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:13:26.412017055+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http, updatestate.new=completed, updatestate.old=updating)

I cannot make my service unhealthy right now, but I tried killing a container in a service and nothing was fired in the service events. Is that normal?
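
To check output like the sample above for a particular service, the plain-text lines can be parsed into fields. The following is a small Python sketch, assuming the line layout shown in the sample (timestamp, type, action, ID, then attributes in parentheses). The names `EVENT_RE` and `parse_event` are hypothetical.

```python
import re

# timestamp, type, action, id, then "(key=value, ...)" attributes
EVENT_RE = re.compile(r"^(\S+) (\S+) (\S+) (\S+) \((.*)\)$")


def parse_event(line):
    """Parse one line of plain `docker events` output as shown above.

    Returns None for lines that do not match the expected layout.
    """
    match = EVENT_RE.match(line.strip())
    if match is None:
        return None
    timestamp, event_type, action, actor_id, raw_attrs = match.groups()
    attrs = dict(item.split("=", 1) for item in raw_attrs.split(", "))
    return {"time": timestamp, "type": event_type, "action": action,
            "id": actor_id, "attrs": attrs}


sample = ("2019-01-03T00:13:26.301852250+01:00 service update "
          "wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, "
          "updatestate.new=completed, updatestate.old=updating)")
event = parse_event(sample)
print(event["type"], event["action"], event["attrs"].get("updatestate.new"))
```

Every line in the sample parses as a `service update` event, which matches the observation that killing a container did not add anything to this stream. The pattern only handles single-word actions, so it is not meant for container events such as `health_status`, whose action contains a space.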

Contributor

When you kill the container, does the service start up again? If I recall, this would not fire a service event.

We have the same issue when a container exits and Docker restarts it.

Contributor

@sguilly Could you provide more details about your issue?

Answering for @sguilly: when one of our containers exits (through normal operation or because an error killed it) and is restarted by Docker Swarm, the listener doesn't "see" the new container. The restart policy is set in docker-compose:

  version: '3.4'

  networks:
    proxy_proxy:
      external: true

  services:

    api:
      image: <our api>

      networks:
        - proxy_proxy

      deploy:
        mode: replicated
        replicas: 4

        update_config:
          parallelism: 1
          delay: 10s
          order: start-first
          failure_action: rollback
          monitor: 30s

        restart_policy:
          # no / any / on-failure
          condition: any
          delay: 30s
          max_attempts: 3000

        resources:
          limits:
            memory: 500M

        labels:
          - com.df.notify=true
          - com.df.scrapeNetwork=proxy_proxy
          - com.df.scrapePort=14000
          - com.df.env=production
          - com.df.metricType=api
          - com.df.alertName=errorsRate
          - "com.df.alertAnnotations=summary=API error rate is high"
          - "com.df.alertLabels=severity=high"
          - 'com.df.alertIf=(sum(rate(http_request_duration_ms_count{code=~"^5..$$"}[1m])) / sum(rate(http_request_duration_ms_count[1m]))) > 0.05'

      environment:
        NODE_ENV: production

If we restart the docker-flow-swarm-listener service, the container shows up. If we don't restart the service, Prometheus displays the following error:

Get http://10.0.11.213:14000/metrics: dial tcp 10.0.11.213:14000: connect: no route to host

Contributor

@Mualig Which docker version are you using?

We have multiple nodes in our swarm. Monitor is always deployed on a node with docker 18.06.1-ce (this version is on the majority of the nodes), but some are on docker 18.03.1-ce and one is on docker 18.09.0.
