Notification if container is unhealthy > renew #40

Open
@nlevee

Description

Is the listener capable of sending a notification when a container in a service becomes unhealthy and is then renewed?

Thanks

Activity

thomasjpfan (Contributor) commented on Dec 22, 2018

DFSL sends a notification when a service becomes healthy, so if the service only becomes healthy later, the notification is sent at that point.

nlevee (Author) commented on Jan 2, 2019

This is not what happens with version 18.11.28-19: when my service is unhealthy, Docker Swarm kills the container and starts another one, but the listener doesn't send any notification.

Do you need more information?

thomasjpfan (Contributor) commented on Jan 2, 2019

Can you provide information about how your service is set up?

nlevee (Author) commented on Jan 2, 2019

My services are set up like this:

version: '3.4'

services:
  webapp-front-http:
    image: apache:latest
    ports:
      - "80"
    healthcheck:
      test: "curl -f http://127.0.0.1/server-status?auto || exit 1"
      interval: 60s
      timeout: 5s
      retries: 5
      start_period: 10s
    deploy:
      mode: replicated
      replicas: 3
      labels:
        com.df.consulName: 'front-http'
        com.df.stackName: 'regiecamp_webapp'
        com.df.scrapeNetwork: 'regiecamp_webapp_default'
        com.df.notify: 'true'
        com.df.port: '80'
      restart_policy:
        condition: on-failure
      update_config:
        order: start-first
        parallelism: 3
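
With a healthcheck like the one above, Docker marks a container as unhealthy once the probe has failed `retries` times in a row. A minimal sketch for listing such containers on the current node, assuming a POSIX shell and the docker CLI; the function name `list_unhealthy` is hypothetical and not part of Docker or DFSL:

```shell
#!/bin/sh
# Sketch: list containers on this node whose healthcheck currently reports "unhealthy".
# list_unhealthy is an illustrative helper name, not part of Docker or DFSL.
list_unhealthy() {
  # health=unhealthy keeps only containers whose healthcheck is failing;
  # the format prints each container name with its human-readable status.
  docker ps --filter health=unhealthy --format '{{.Names}}: {{.Status}}'
}
```

Since docker ps only sees containers on the node where it runs, in a multi-node swarm this would have to be run on each node.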

thomasjpfan (Contributor) commented on Jan 3, 2019

As a debugging option, can you listen to Docker events by running docker events -f type=service and see which events fire when:

  1. Service starts
  2. Service becomes unhealthy
  3. Service becomes healthy

I didn't ask before: does your service become healthy at some point?

nlevee (Author) commented on Jan 3, 2019

The container is killed by Docker and a new one is started after that, so the service becomes healthy once the new container is running.

I will try the debugging steps and report back.

nlevee (Author) commented on Jan 3, 2019

So I tried the command:
docker events -f type=service

Sample of the output:

2019-01-03T00:11:04.055685338+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php)
2019-01-03T00:11:04.080270513+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php, updatestate.new=updating)
2019-01-03T00:11:12.256159085+01:00 service update hb5rrsl3fm2yc1gpajrsdutkp (name=xxxxxxxxx_webapp_webapp-api-php, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:11:22.021262027+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php)
2019-01-03T00:11:22.031196553+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php, updatestate.new=updating)
2019-01-03T00:11:29.965940816+01:00 service update ptq3swwaw43cn7knowy4664aj (name=xxxxxxxxx_webapp_webapp-front-php, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:12:17.609856908+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http)
2019-01-03T00:12:17.617948834+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http)
2019-01-03T00:12:17.630087290+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http, updatestate.new=updating)
2019-01-03T00:12:17.641321163+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, updatestate.new=updating)
2019-01-03T00:13:26.301852250+01:00 service update wtcglpmm9ewbeh9uxl7kgnapl (name=xxxxxxxxx_webapp_webapp-front-http, updatestate.new=completed, updatestate.old=updating)
2019-01-03T00:13:26.412017055+01:00 service update p12hfbl977gqgaqrf6w6o2636 (name=xxxxxxxxx_webapp_webapp-api-http, updatestate.new=completed, updatestate.old=updating)

I cannot make my service unhealthy right now, but I tried killing a container in a service and nothing showed up in the service events. Is that normal?
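
The update-state transitions in output like the sample above can be pulled out with a small filter. This is only a sketch, assuming a POSIX shell with sed and the line layout shown in the sample; `summarize_update_states` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: read `docker events -f type=service` output on stdin and print
# "<service name> <new update state>" for every line that carries an
# updatestate.new attribute. Lines without that attribute are skipped.
summarize_update_states() {
  sed -n 's/.*(name=\([^,)]*\), updatestate\.new=\([^,)]*\).*/\1 \2/p'
}
```

For example, with the events saved to a file (events.log is a placeholder name), `summarize_update_states < events.log` prints one line per state change, which makes it easier to see that only service updates appear there and not container restarts.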

thomasjpfan (Contributor) commented on Jan 3, 2019

When you kill the container, does the service start up again? If I recall correctly, this would not fire a service event.
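
A killed or restarted container is reported at the container level rather than as a service event. The following is a hedged sketch for watching those events on a single node, assuming a POSIX shell and the docker CLI; `watch_container_events` is a hypothetical helper name, and whether DFSL reacts to any of these events is not established in this thread:

```shell
#!/bin/sh
# Sketch: stream container-level events on the local node.
# Repeating the same filter key (event=...) is treated as a logical OR.
watch_container_events() {
  # die           : a container exited or was killed
  # start         : a (replacement) container started
  # health_status : a healthcheck result changed (healthy / unhealthy)
  docker events \
    --filter type=container \
    --filter event=die \
    --filter event=start \
    --filter event=health_status
}
```

Container events are local to the node that runs the container, so this has to be run on the node hosting the task, unlike service events, which are reported by swarm managers.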

sguilly commented on Jan 29, 2019

We have the same issue when a container exits and Docker restarts it.

thomasjpfan (Contributor) commented on Jan 29, 2019

@sguilly Could you provide more details about your issue?

Mualig commented on Feb 4, 2019

Answering for @sguilly: when one of our containers exits (normal operation or killed by an error) and is restarted by Docker Swarm, the listener doesn't "see" the new container. The restart policy is set in docker-compose:

  version: '3.4'

  networks:
    proxy_proxy:
      external: true

  services:

    api:
      image: <our api>

      networks:
        - proxy_proxy

      deploy:
        mode: replicated
        replicas: 4

        update_config:
          parallelism: 1
          delay: 10s
          order: start-first
          failure_action: rollback
          monitor: 30s

        restart_policy:
          # no / any / on-failure
          condition: any
          delay: 30s
          max_attempts: 3000

        resources:
          limits:
            memory: 500M

        labels:
          - com.df.notify=true
          - com.df.scrapeNetwork=proxy_proxy
          - com.df.scrapePort=14000
          - com.df.env=production
          - com.df.metricType=api
          - com.df.alertName=errorsRate
          - "com.df.alertAnnotations=summary=API error rate is high"
          - "com.df.alertLabels=severity=high"
          - 'com.df.alertIf=(sum(rate(http_request_duration_ms_count{code=~"^5..$$"}[1m])) / sum(rate(http_request_duration_ms_count[1m]))) > 0.05'

      environment:
        NODE_ENV: production

If we restart the docker-flow-swarm-listener service, the container shows up. But if we don't restart the service, Prometheus displays the following error:

Get http://10.0.11.213:14000/metrics: dial tcp 10.0.11.213:14000: connect: no route to host

thomasjpfan (Contributor) commented on Feb 5, 2019

@Mualig Which docker version are you using?

Mualig commented on Feb 6, 2019

We have multiple nodes in our swarm. Monitor is always deployed on a node with docker 18.06.1-ce (this version is on the majority of the nodes), but some are on docker 18.03.1-ce and one is on docker 18.09.0.

Notification if container is unhealthy > renew · Issue #40 · docker-flow/docker-flow-swarm-listener