Difference between /health and /health?ready=1

Hi Roachers :) I have read the section under health-endpoints and I want to confirm my understanding of how the two endpoints, i.e. /health and /health?ready=1, differ. It seems that /health checks whether Cockroach is running, while /health?ready=1 checks whether the node is a healthy member of the cluster.

To give some background, I am setting up our cluster in AWS and am using the /health?ready=1 endpoint as our ELB health check. Cockroach is started by the user-data script with a join statement that gets its IPs from the ASG. Before my automation can run init, it needs to check that cockroach is running on the nodes, and it seems I can use the /health endpoint for that. Please let me know whether I have understood correctly or am missing something.

Hi @fat0,

It’s definitely best to use /health?ready=1 for your load balancer’s health check, but for your startup script, you can actually use either. If a node is not running at all, either endpoint will give you a connection refused error:

~$ curl http://localhost:8080/health
curl: (7) Failed to connect to localhost port 8080: Connection refused
~$ curl http://localhost:8080/health?ready=1
curl: (7) Failed to connect to localhost port 8080: Connection refused

If the node were started but not yet initialized, I’d get the following responses:

~$ curl http://localhost:8080/health
{
  "nodeId": 0,
  "address": {
    "networkField": "",
    "addressField": ""
  },
  "buildInfo": {
    "goVersion": "go1.10.1",
    "tag": "v2.1.0-alpha.20180604-180-g74705ae963-dirty",
    "time": "2018/06/04 19:31:11",
    "revision": "74705ae96394cd6698253cf9d681e5d458aa0726",
    "cgoCompiler": "4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)",
    "cgoTargetTriple": "x86_64-apple-darwin17.5.0",
    "platform": "darwin amd64",
    "distribution": "CCL",
    "type": "development",
    "channel": "unknown",
    "dependencies": null
  }
}
~$ curl http://localhost:8080/health?ready=1
{
  "error": "node is not ready",
  "code": 14
}
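
For the startup-script case, the difference between "connection refused" and "any response at all" can be turned into a small polling loop. This is only a minimal sketch, assuming the HTTP port 8080 from the examples above; the function name wait_for_cockroach and its attempt/delay parameters are hypothetical, not part of CockroachDB.

```shell
#!/usr/bin/env bash

# wait_for_cockroach BASE_URL [MAX_ATTEMPTS] [DELAY_SECONDS]
# Hypothetical helper: polls BASE_URL/health until the server answers.
# Returns 0 as soon as any HTTP response comes back (the process is up,
# even if the cluster is not initialized yet), or 1 if every attempt fails.
wait_for_cockroach() {
  local base_url="$1"
  local max_attempts="${2:-30}"
  local delay="${3:-2}"
  local attempt

  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # Without --fail, curl exits 0 for any HTTP response and non-zero only
    # when it cannot talk to the server (e.g. exit 7, connection refused).
    if curl --silent --max-time 2 --output /dev/null "${base_url}/health"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example use in a startup script (the init step itself is up to you):
#   wait_for_cockroach "http://localhost:8080" && echo "cockroach is running"
```

Because the loop only looks at whether curl could connect, it behaves the same with either endpoint, which matches the point above that both give connection refused when the node is not running.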

Hope that helps. Let me know if you have additional questions.

Best,
Jesse

Thank you :)

You're welcome, @fat0. Just reach out when you have more questions.

@jesse, I should have checked this earlier: even though the /health?ready=1 endpoint prints JSON in both scenarios, the response codes are different. I'm adding more details in "CLI or API to query before removing more nodes" in case anyone else is wondering.
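
Since the body is JSON either way, a script has to look at the HTTP status rather than the output. Here is a rough sketch of that check, assuming port 8080 as in the examples above and assuming that a ready node answers /health?ready=1 with 200; anything else is simply treated as not ready. The names http_status and classify_node are made up for illustration.

```shell
#!/usr/bin/env bash

# http_status URL
# Prints only the HTTP status code of a GET request. curl prints 000 when
# it received no response at all (for example, connection refused).
http_status() {
  curl --silent --output /dev/null --write-out '%{http_code}' --max-time 2 "$1" || true
}

# classify_node READY_STATUS
# Maps the status code from /health?ready=1 to a coarse node state.
classify_node() {
  case "$1" in
    000) echo "down" ;;       # no HTTP response, node is not running
    200) echo "ready" ;;      # assumption: ready nodes answer with 200
    *)   echo "not-ready" ;;  # server answered with a non-200 status
  esac
}

# Example use:
#   classify_node "$(http_status 'http://localhost:8080/health?ready=1')"
```

Note the quotes around the URL in the example: an unquoted ? can be treated as a glob character by some shells.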