Hi Roachers, I have read the section under health-endpoints and want to confirm my understanding of how the two endpoints, /health and /health?ready=1, differ. It seems that /health checks whether Cockroach is running, while /health?ready=1 checks whether the node is a healthy member of the cluster. For background: I am setting up our cluster in AWS and using the /health?ready=1 endpoint as our ELB health check. Cockroach is started by the user-data script with a join statement that gets its IPs from the ASG. Before my automation can run init, it needs to check that Cockroach is running on the nodes, and it seems I can use the /health endpoint for that. Please let me know whether I have understood correctly or am missing something.
Hi @fat0,
It’s definitely best to use /health?ready=1 for your load balancer’s health check, but for your startup script, you can actually use either. If a node is not running at all, either endpoint will give you a connection refused error:
~$ curl http://localhost:8080/health
curl: (7) Failed to connect to localhost port 8080: Connection refused
~$ curl http://localhost:8080/health?ready=1
curl: (7) Failed to connect to localhost port 8080: Connection refused
If the node were started but not yet initialized, I’d get the following responses:
~$ curl http://localhost:8080/health
{
"nodeId": 0,
"address": {
"networkField": "",
"addressField": ""
},
"buildInfo": {
"goVersion": "go1.10.1",
"tag": "v2.1.0-alpha.20180604-180-g74705ae963-dirty",
"time": "2018/06/04 19:31:11",
"revision": "74705ae96394cd6698253cf9d681e5d458aa0726",
"cgoCompiler": "4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.1)",
"cgoTargetTriple": "x86_64-apple-darwin17.5.0",
"platform": "darwin amd64",
"distribution": "CCL",
"type": "development",
"channel": "unknown",
"dependencies": null
}
}
~$ curl http://localhost:8080/health?ready=1
{
"error": "node is not ready",
"code": 14
}
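For a startup script, the "is the process up yet?" check above could be sketched roughly as follows. This is only an illustration, not an official recipe: the helper names is_running and wait_until_running, the retry counts, and the base URL are all assumptions of mine. It treats any HTTP answer from /health as "running" and a refused connection as "not running".

```python
import time
import urllib.error
import urllib.request

# Bypass any HTTP proxy from the environment; the check targets the node directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def is_running(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if something answers HTTP at base_url + '/health'."""
    try:
        with _opener.open(base_url + "/health", timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered with a non-2xx status, so the process is up.
        return True
    except OSError:
        # Connection refused, timeout, and similar failures (URLError is an OSError).
        return False


def wait_until_running(base_url: str, attempts: int = 30, delay: float = 2.0) -> bool:
    """Poll /health until the node accepts connections or attempts run out."""
    for _ in range(attempts):
        if is_running(base_url):
            return True
        time.sleep(delay)
    return False
```

Automation could call wait_until_running("http://localhost:8080") for each node and only run init once it returns True for all of them.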
Hope that helps. Let me know if you have additional questions.
Best,
Jesse
Thank you
@jesse, I should have checked this earlier: even though the /health?ready=1 endpoint prints JSON in both scenarios, the response codes are different. I am adding more details in "CLI or API to query before removing more nodes" in case anyone else is wondering.
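In case it helps anyone, here is a rough sketch of checking the response code rather than the body. It assumes /health?ready=1 returns 200 when the node is ready and some non-200 code otherwise (the docs describe a 503 for a node that is not ready, but please verify against your version). The function names ready_status_code and describe are made up for this example.

```python
import urllib.error
import urllib.request
from typing import Optional

# Bypass any HTTP proxy from the environment; the check targets the node directly.
_opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def ready_status_code(base_url: str, timeout: float = 2.0) -> Optional[int]:
    """Return the HTTP status of /health?ready=1, or None if the node is unreachable."""
    try:
        with _opener.open(base_url + "/health?ready=1", timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises for non-2xx responses; the status code is still available.
        return err.code
    except OSError:
        # Connection refused, timeout, and similar failures.
        return None


def describe(code: Optional[int]) -> str:
    """Map a status code (assumed: 200 means ready) to a human-readable state."""
    if code is None:
        return "not running"
    if code == 200:
        return "ready"
    return "running but not ready"
```

For example, describe(ready_status_code("http://localhost:8080")) distinguishes a node that is down from one that is started but not yet initialized, without parsing the JSON at all.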