Docker 1.12 introduces the HEALTHCHECK instruction, which lets a health check for a container be baked into the image definition. The check can be overridden on the command line when the container is run. As with the CMD instruction, a Dockerfile may contain multiple HEALTHCHECK instructions, but only the last one takes effect.

Why do we need it?

This is a great addition because a container whose status reads Up 1 hour may still be returning errors. The container process is running, but until now there was no way for the application inside it to report its own status. This instruction fixes that.

The Dockerfile that builds the arungupta/couchbase image is:

FROM couchbase:latest

COPY configure-node.sh /opt/couchbase

HEALTHCHECK --interval=5s --timeout=3s CMD curl --fail http://localhost:8091/pools || exit 1

CMD ["/opt/couchbase/configure-node.sh"]

It uses the configure-node.sh script to configure the server using the Couchbase REST API. The new instruction to notice here is HEALTHCHECK.

HEALTHCHECK instruction

This instruction can be specified as:

HEALTHCHECK <options> CMD <command>

The <options> can be:

  • --interval=DURATION (default 30s)
  • --timeout=DURATION (default 30s)
  • --retries=N (default 3)

The <command> is the command that runs inside the container to check the health.

States

If a health check is enabled, the container can be in one of three states:

  • starting – Initial status while the container is still starting
  • healthy – Whenever the command succeeds, the container is marked healthy
  • unhealthy – A single run of the <command> that exits with a non-zero status, or takes longer than the specified timeout, counts as a failed check. The container is declared unhealthy once the check has failed retries times in a row.
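
The way consecutive failures lead to the unhealthy state can be sketched in plain shell. This is a simplified model for illustration, not Docker's implementation; the function name simulate_health and its output format are made up for this example.

```shell
#!/bin/sh
# simulate_health RETRIES EXIT_CODE...
# Hypothetical helper: feeds a sequence of health check exit codes through
# a simplified model and prints the reported status after each check.
simulate_health() {
  retries=$1; shift
  streak=0            # consecutive failed checks so far
  status=starting     # status before the first check completes
  for code in "$@"; do
    if [ "$code" -eq 0 ]; then
      streak=0
      status=healthy
    else
      streak=$((streak + 1))
      # Only a run of RETRIES consecutive failures flips the status.
      if [ "$streak" -ge "$retries" ]; then
        status=unhealthy
      fi
    fi
    echo "check exit=$code streak=$streak status=$status"
  done
}

# One success, three failures in a row, then a recovery.
simulate_health 3 0 1 1 1 0
```

With retries set to 3, the first two failures leave the status unchanged, the third marks the container unhealthy, and the final success returns it to healthy.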

The command's exit status indicates the health status of the container. The following values are allowed:

  • 0 – container is healthy
  • 1 – container is not healthy

In our instruction, the /pools REST API is invoked using curl. If curl fails, the || exit 1 clause turns any non-zero curl status into an exit status of 1, and that attempt counts as a failed check. The command is invoked every 5 seconds, and an attempt that does not return within 3 seconds also counts as failed. Because --retries is not specified, the default of 3 applies, so the container is marked unhealthy after three consecutive failed checks.
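
The || exit 1 idiom can be tried outside a container. The sketch below is illustrative only: run_check is a hypothetical helper name, and the probes are stand-ins so that it runs without a Couchbase server. The value 22 is the status curl returns with --fail on an HTTP error response.

```shell
#!/bin/sh
# run_check COMMAND [ARGS...]
# Hypothetical helper: runs a probe command and collapses its result to
# 0 (success) or 1 (failure), like "|| exit 1" in the HEALTHCHECK line.
run_check() {
  "$@" >/dev/null 2>&1 || return 1
  return 0
}

# Stand-in probes. In the image the probe would be:
#   curl --fail http://localhost:8091/pools
run_check true
echo "successful probe -> exit $?"

run_check sh -c 'exit 22'
echo "failing probe    -> exit $?"
```

Whatever non-zero status the probe returns, the health check itself reports 1.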

Check the status

Run the container as:

docker run -d --name db arungupta/couchbase:latest

Check the status:

docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED             STATUS                            PORTS                                                        NAMES
55b14302671e        arungupta/couchbase:latest   "/entrypoint.sh /opt/"   2 seconds ago       Up 1 seconds (health: starting)   8091-8094/tcp, 11207/tcp, 11210-11211/tcp, 18091-18093/tcp   db

Notice how the health: starting status is reported in the STATUS column. Checking again after a few seconds shows:

docker ps
CONTAINER ID        IMAGE                        COMMAND                  CREATED              STATUS                        PORTS                                                        NAMES
55b14302671e        arungupta/couchbase:latest   "/entrypoint.sh /opt/"   About a minute ago   Up About a minute (healthy)   8091-8094/tcp, 11207/tcp, 11210-11211/tcp, 18091-18093/tcp   db

And now it's reported as healthy.
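
Instead of re-running docker ps by hand, a script can poll the status until the container is ready. The following is a minimal sketch: health_status and wait_healthy are hypothetical helper names, and the default of 30 tries with a 2 second delay is an arbitrary choice. It reads the Status field of .State.Health through docker inspect.

```shell
#!/bin/sh
# health_status NAME
# Prints the current health status (starting, healthy or unhealthy).
health_status() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# wait_healthy NAME [TRIES] [DELAY_SECONDS]
# Polls until the container reports healthy. Returns 0 once it is healthy,
# 1 if it turns unhealthy or the tries run out.
wait_healthy() {
  name=$1
  tries=${2:-30}
  delay=${3:-2}
  while [ "$tries" -gt 0 ]; do
    case "$(health_status "$name")" in
      healthy)   return 0 ;;
      unhealthy) return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}
```

For the container started above, this could be used as wait_healthy db && echo "db is ready".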

More details about this HEALTHCHECK instruction can be found on docs.docker.com.

Checking health without HEALTHCHECK

Now, if you are running an image that does not have a HEALTHCHECK instruction, the docker run command can be used to specify similar values. An equivalent runtime command would be:

docker run -d --name db --health-cmd "curl --fail http://localhost:8091/pools || exit 1" --health-interval=5s --health-timeout=3s arungupta/couchbase

The results of the last 5 health checks for a container can be obtained using the docker inspect command:

docker inspect --format='{{json .State.Health}}' db

The output is shown as:

{
  "Status": "healthy",
  "FailingStreak": 0,
  "Log": [
    {
      "Start": "2016-11-12T03:23:03.351561Z",
      "End": "2016-11-12T03:23:03.422176171Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   595k      0 --:--:-- --:--:-- --:--:--  750k\n{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}"
    },
    {
      "Start": "2016-11-12T03:23:08.423558928Z",
      "End": "2016-11-12T03:23:08.510122392Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   309k      0 --:--:-- --:--:-- --:--:--  375k\n{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}"
    },
    {
      "Start": "2016-11-12T03:23:13.511446818Z",
      "End": "2016-11-12T03:23:13.58141325Z",
      "ExitCode": 0,
      "Output": " {\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}} % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   248k      0 --:--:-- --:--:-- --:--:--  375k\n"
    },
    {
      "Start": "2016-11-12T03:23:18.583512367Z",
      "End": "2016-11-12T03:23:18.677727356Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dlo{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}ad  Upload   Total   Spent    Left  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   307k      0 --:--:-- --:--:-- --:--:--  375k\n"
    },
    {
      "Start": "2016-11-12T03:23:23.679661467Z",
      "End": "2016-11-12T03:23:23.782372291Z",
      "ExitCode": 0,
      "Output": "  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current\n                                 Dload  Upload   Total   Spent    Left{\"isAdminCreds\":true,\"isROAdminCreds\":false,\"isEnterprise\":true,\"pools\":[{\"name\":\"default\",\"uri\":\"/pools/default?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"streamingUri\":\"/poolsStreaming/default?uuid=1b84cdbd136e4e8466049dd062dd6969\"}],\"settings\":{\"maxParallelIndexers\":\"/settings/maxParallelIndexers?uuid=1b84cdbd136e4e8466049dd062dd6969\",\"viewUpdateDaemon\":\"/settings/viewUpdateDaemon?uuid=1b84cdbd136e4e8466049dd062dd6969\"},\"uuid\":\"1b84cdbd136e4e8466049dd062dd6969\",\"implementationVersion\":\"4.5.1-2844-enterprise\",\"componentsVersion\":{\"lhttpc\":\"1.3.0\",\"os_mon\":\"2.2.14\",\"public_key\":\"0.21\",\"asn1\":\"2.0.4\",\"kernel\":\"2.16.4\",\"ale\":\"4.5.1-2844-enterprise\",\"inets\":\"5.9.8\",\"ns_server\":\"4.5.1-2844-enterprise\",\"crypto\":\"3.2\",\"ssl\":\"5.3.3\",\"sasl\":\"2.3.4\",\"stdlib\":\"1.19.4\"}}  Speed\n\r  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0\r100   768  100   768    0     0   439k      0 --:--:-- --:--:-- --:--:--  750k\n"
    }
  ]
}
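
When only a summary of the log is needed, the JSON can be filtered in a script. The sketch below is one possible approach under stated assumptions: count_failed_checks is a hypothetical helper name, the sample JSON is a trimmed, made-up stand-in for real inspect output, and the -o option of grep is assumed to be available (it is in GNU and BSD grep).

```shell
#!/bin/sh
# count_failed_checks
# Hypothetical helper: reads the output of
#   docker inspect --format='{{json .State.Health}}' NAME
# on stdin and prints how many logged checks have a non-zero ExitCode.
count_failed_checks() {
  grep -oE '"ExitCode": *-?[0-9]+' |
    awk -F: '$2 + 0 != 0 { n++ } END { print n + 0 }'
}

# Made-up sample with one passing and two failing checks.
sample='{"Status":"unhealthy","FailingStreak":2,"Log":[{"ExitCode":0},{"ExitCode":1},{"ExitCode":1}]}'
printf '%s\n' "$sample" | count_failed_checks
```

Against a running container, the same function could be fed with docker inspect --format='{{json .State.Health}}' db | count_failed_checks. The current run of consecutive failures is also available directly as the FailingStreak field shown in the output above.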

Health Check of Docker Containers

About The Author
- Arun Gupta is the vice president of developer advocacy at Couchbase. He has been building developer communities for 10+ years at Sun, Oracle, and Red Hat. He has deep expertise in leading cross-functional teams to develop and execute strategy, planning and execution of content, marketing campaigns, and programs. Prior to that he led engineering teams at Sun and is a founding member of the Java EE team. Gupta has authored more than 2,000 blog posts on technology. He has extensive speaking experience in more than 40 countries on myriad topics and is a JavaOne Rock Star. Gupta also founded the Devoxx4Kids chapter in the US and continues to promote technology education among children. An author of a best-selling book, an avid runner, a globe trotter, a Java Champion, and a JUG leader, he is easily accessible at @arungupta.
