Admin UI not connecting

Hello everyone,

I came across a rather strange thing today. I have 2 droplets on DigitalOcean.

lon1.xxxx.com and sanfran1.xxxx.com

Last night, I set up both servers, joined them together, and everything was fine. When I woke up today I decided to write a small installer script to make CA creation and server management a one-line job. It was working pretty well until I found myself unable to connect to the admin UI or make the 2 roach nodes connect to each other.

I thought this might be down to the script, so I rebuilt the droplet and did a manual install and config of everything…and I was still not able to connect, so I did it 2 more times…and here we are…I have no idea what is causing this. I have checked firewalls, ports, and addresses, and even tried it on another network, with no success.

Anyone have any ideas?

  • Mark

Hi Mark,

Two nodes are not enough to tolerate a node failure in a multi-node CockroachDB cluster. Only with 3 or more nodes can CockroachDB still reach quorum and let transactions make progress when one node disappears.

If you start a cluster of two nodes, then remove one, the admin UI cannot load any more. Could that be what happened in your setup?

You can use the following doc page for details: https://www.cockroachlabs.com/docs/recommended-production-settings.html#cluster-topology
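The quorum arithmetic behind this can be sketched in a few lines of shell. This is a simplified illustration, not CockroachDB code: it assumes a plain majority rule over n voting members, and the function names quorum and tolerated_failures are made up for the example. CockroachDB applies this kind of rule to the replicas of each range rather than to the cluster as a whole, so treat the output as a rule of thumb.

```shell
#!/bin/sh
# Majority quorum for a group of n voting members: floor(n / 2) + 1.
# (Hypothetical helper, named for this example only.)
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of members that can be lost while a majority is still reachable.
tolerated_failures() {
  echo $(( $1 - $(quorum "$1") ))
}

# Print the figures for a few small cluster sizes.
for n in 1 2 3 5; do
  echo "nodes=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 nodes the quorum is 2, so losing either node stalls the group; with 3 nodes the quorum is still 2, so one node can disappear.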

See…this is where it starts to hurt my head…I decided to start from scratch and work my way through the nodes.

After launching the first node, and the first node alone, I was still unable to load the admin UI, so I said screw it and launched the second node. Trying to connect to the UI using that node's address returned a 404.

I set up a third node using my own server setup and it was not even able to connect to the other nodes.

I was wanting to use cockroach as the backbone of a rather big project I am starting work on, but with this issue I am already 2 days behind, so I might have to use something else!

@apollo, I’m sorry you’ve been running into these issues. I’ve run test clusters on DigitalOcean over the past few days without any trouble, so I’m interested to learn more about your setup and try to help. A few questions:

  • In terms of being able to access the admin UI, did you add a firewall rule to open up port 8080?
  • Can you share the cockroach start commands (scrubbing the IPs in --host and --join)?
  • When you start from scratch, are you removing the cockroach data directory on each droplet? Those are called cockroach-data by default.

@jesse Sure, no problem.

I had 2 firewalls set up, one named allow.all and one named allow.roach. Until the problem appeared I was using allow.roach, which allowed database and UI connections only. When the problems started, I made a new firewall that allowed all ports, UDP and TCP, from any source. I made sure to have only one of the firewalls enabled on the droplets at a time.

When starting from scratch with the standard cockroach commands, I would have used the following.

cockroach start --certs-dir=certs --host=lon1.xxxx.com &

When joining a node

cockroach start --certs-dir=certs --host=sanfran1.xxxx.com --join=lon1.xxxx.com &

I made sure that the hostnames resolved to the correct IP addresses.

Here is a link to the GitHub repo for the script I wrote, which I was using until these problems started

https://github.com/MooseTheCoder/roachkit

When I was starting from scratch I was completely rebuilding the droplet, as cockroach was the only thing running on it.

I have taken a break from it today but might try again one last time!

Okay…

I found a rather embarrassing solution…

I’m working on this project alongside a student, so I told him to go get me his command history…

And each time the server was reset, he would go and change the machine hostname to “xxx-xxx”

So, I changed the machine hostname back to “lon1” and now everything is working perfectly…
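For anyone who hits the same thing, a check like the one below could be run before cockroach start. It is only a sketch: the thread does not explain why the changed hostname broke things, and the helper name host_matches and the rule of comparing only the first label of each name are assumptions for illustration. The address lon1.xxxx.com is the scrubbed placeholder from the commands above.

```shell
#!/bin/sh
# Succeed when the first label of the expected address matches the first
# label of the machine hostname (for example "lon1.xxxx.com" vs "lon1").
# Hypothetical helper, not part of cockroach.
host_matches() {
  expected_short="${1%%.*}"   # strip everything after the first dot
  actual_short="${2%%.*}"
  [ "$expected_short" = "$actual_short" ]
}

expected="lon1.xxxx.com"      # placeholder from the thread, replace with the real --host value
actual="$(uname -n)"          # the machine's current hostname

if host_matches "$expected" "$actual"; then
  echo "hostname '$actual' matches '$expected'"
else
  echo "hostname '$actual' does not match '$expected'" >&2
fi
```

On a droplet whose hostname was changed to something like “xxx-xxx”, this prints the mismatch to stderr instead of failing silently later.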

Please end me…

Oh, good! Glad the problem’s solved!

Just curious: Did you reference our Digital Ocean deployment tutorial when writing up your script? If so, please let me know if anything was confusing or missing.

@jesse I didn’t look at it, but I’ll check it out and let you know if I run into any difficulties.