CockroachDB on OpenBSD

Has anyone succeeded in running CockroachDB on OpenBSD? I tried to compile it on OpenBSD-current (AMD64) but get the following error:

gmake: *** [Makefile:177: install] Error 2

Note: Although I have Go 1.8 installed, the version check fails, so I commented out the check.

Hi @mrijkeboer! Thanks for reporting. Sorry you hit this issue.

We’ve compiled CockroachDB successfully on FreeBSD. I’m not aware of anyone who’s compiled it on OpenBSD, but someone else here at Cockroach Labs might be.

Would you mind telling me what version of CockroachDB you’re running so I can pinpoint the line in the Makefile at that version? If you’re compiling from Git, the commit SHA would be great; otherwise I just need the version you downloaded the source tarball for.

It’s also surprising the Go version detection failed; that detection is happy as long as it sees “go1.8” somewhere in the output of go version. Could you post the output of go version on your machine?
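
For readers who want to reproduce that kind of check by hand, here is a minimal sketch. It is not the Makefile's actual logic; the function name go_version_ok is hypothetical, and it only tests whether the required release string appears anywhere in the text it is given.

```shell
# Hypothetical helper: succeeds when the given `go version` output mentions
# the required release prefix (for example "go1.8").
go_version_ok() {
    required=$1
    output=$2
    case "$output" in
        *"$required"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   go_version_ok go1.8 "$(go version)" && echo "Go version looks fine"
```

With the output posted later in this thread (go version go1.8 openbsd/amd64), a substring check like this would pass, which is why the failure was surprising.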

Hi @benesch,

I’ve downloaded (today) the cockroach-latest.src.tgz from https://binaries.cockroachdb.com/cockroach-latest.src.tgz. I’m not sure what the exact version number is.

The output of go version:

go version go1.8 openbsd/amd64

Ok, great, you downloaded the latest beta (beta-20170330). The line that failed is the go install command itself, which isn’t very helpful. You didn’t happen to see any other error messages, did you? Otherwise I’ll try to reproduce on an OpenBSD VM later today.

Sorry, I haven’t seen any other error messages. For the record, I used OpenBSD-current (AMD64).

Sorry for the radio silence! OpenBSD unfortunately isn’t one of our company priorities at the moment, so getting this up and running has been a nights-and-weekends kind of deal.

Good news, though: I have a branch (which seems likely to merge) that produces a working binary on OpenBSD! You’ll need the latest GCC and G++ from the ports tree. Like you, I’m using an OpenBSD-current VM. Not sure how this would fare on older versions.

cd $GOPATH/src/github.com/cockroachdb/cockroach
git remote add benesch git@github.com:benesch/cockroach.git
git fetch benesch
git checkout stdmalloc
CC=egcc CXX=eg++ gmake build TAGS=stdmalloc

The final piece of the puzzle, as you can see above, was building without jemalloc. There’s a standing bug in jemalloc that causes deadlocks on OpenBSD; the Cockroach binary will build successfully but hang instantly if you link it against jemalloc.

@mrijkeboer, please give my branch a shot and report back your findings!

Also, if you or anyone else have advice on fixing the jemalloc bug, I’m all ears. Unfortunately I won’t have time to dig in myself—since we’re unlikely to recommend OpenBSD for production deployments in the near future, disabling jemalloc seems like a perfectly reasonable workaround. I’d start by hunting around with RocksDB itself: I couldn’t convince its build system to use -pthread instead of -lpthread, which allegedly fixes the bug.

@benesch, First of all, thanks for taking the time for this. Unfortunately it doesn’t seem to work on my system. I get the following error:

cmd github.com/mattn/goveralls [built]
cmd github.com/mdempsky/unconvert [built]
cmd github.com/mibk/dupl [built]
cmd github.com/opennota/check/cmd/varcheck [built]
cmd github.com/robfig/glock [built]
cmd github.com/stripe/safesql [built]
cmd github.com/tebeka/go2xunit [built]
cmd github.com/wadey/gocovmerge [built]
cmd golang.org/x/tools/cmd/goimports [built]
cmd golang.org/x/tools/cmd/stringer [built]
cmd honnef.co/go/simple/cmd/gosimple [built]
cmd honnef.co/go/staticcheck/cmd/staticcheck [built]
cmd honnef.co/go/unused/cmd/unused [built]
touch .bootstrap
go1\.7.* required (see CONTRIBUTING.md): go version go1.8.1 openbsd/amd64
gmake: *** [Makefile:185: .go-version] Error 1

OpenBSD version:
$ sysctl kern.version
kern.version=OpenBSD 6.1-current (GENERIC.MP) #50: Thu May 4 11:52:48 MDT 2017
    deraadt@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Go version:
$ pkg_info |grep '^go'
go-1.8.1 Go programming language

Any suggestions on what I’m doing wrong?

Hmm, looks like you’ve checked out a very out-of-date version of CockroachDB that’s looking for Go 1.7 instead of Go 1.8.

I merged my stdmalloc branch last night, though, so as of today you should be able to use the tip of master for a successful compile (using gmake build TAGS=stdmalloc). Could you give that a try?

Something like the following should do the trick:

git checkout master
git pull git@github.com:cockroachdb/cockroach.git master
CC=egcc CXX=eg++ gmake build TAGS=stdmalloc

Probably a PEBKAC problem on my side. With your new instructions it compiles now :)

$ cockroach version
Build Tag: 0f59bf400
Build Time: 2017/05/06 17:13:15
Distribution: CCL
Platform: openbsd amd64
Go Version: go1.8.1
C Compiler: gcc 4.9.4
Build SHA-1: 0f59bf40038aa01c0ab25373ffd7980cc91e4d2e
Build Type: development

To start the cluster I had to increase sysctl kern.maxfiles to 20000 and the openfiles limit to 15000. Unfortunately I can’t connect to the cluster; I’m getting the following error:

$ cockroach sql --insecure --host=localhost
Error: unable to connect or connection lost.

Please check the address and credentials such as certificates (if attempting to communicate with a secure cluster).

EOF
Failed running "sql"

However netstat shows that all nodes are listening:
$ netstat -na |grep LISTEN |grep 2625
tcp          0      0  127.0.0.1.26257        *.*                    LISTEN
tcp          0      0  127.0.0.1.26258        *.*                    LISTEN
tcp          0      0  127.0.0.1.26259        *.*                    LISTEN

Any suggestions on what I’m doing wrong? I’m now running with the following ulimits:

$ ulimit -a
time(cpu-seconds)    unlimited
file(blocks)         unlimited
coredump(blocks)     unlimited
data(kbytes)         1572864
stack(kbytes)        4096
lockedmem(kbytes)    2696464
memory(kbytes)       8086804
nofiles(descriptors) 15000
processes            256

What happens when you run the following?

$ cockroach start --insecure --logtostderr

I see the usual startup messages, followed by nearly never-ending lines that look like this:

I170507 08:44:57.330318 112 vendor/google.golang.org/grpc/clientconn.go:806  grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup openbsd.my.domain on 192.168.8.1:53: no such host"; Reconnecting to {openbsd.my.domain:26257 <nil>}

Not good! Looks like something is having trouble resolving the hostname. As a workaround, if I instead run

$ cockroach start --insecure --host=localhost --logtostderr

the cluster is happy. This does mean that the cluster will only accept connections from localhost—i.e., you won’t be able to connect from another machine. I’ll look into this shortly; meanwhile, let me know if the workaround also solves the problem for you!

If I run cockroach start --insecure --logtostderr I get the following:

I170508 16:12:33.469749 187 vendor/google.golang.org/grpc/clientconn.go:806  grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp [::1]:26259: getsockopt: connection refused"; Reconnecting to {localhost:26259 <nil>}

When I run cockroach start --insecure --host=localhost --logtostderr I get the following:

I170508 16:18:46.567044 191 vendor/google.golang.org/grpc/clientconn.go:806  grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp [::1]:26259: getsockopt: connection refused"; Reconnecting to {localhost:26259 <nil>}

For the record, I’m running as a normal user (not root) and using a /etc/hosts file containing the following:

127.0.0.1       server.domain.com server localhost
::1             localhost
127.0.1.1       server.domain.com server

Looks like your cluster is remembering an old node at localhost:26259 and trying to reconnect. Easiest solution is to nuke the existing cluster data, assuming you didn’t store anything important in your cluster. So rm -rf cockroach-data and try again!

I’ve nuked the cockroach-data directory. If I run cockroach start --insecure --logtostderr I now get the following:

W170508 16:50:41.250887 21 server/status/runtime.go:184  [n1] unable to get file descriptor usage (will not try again): not implemented on openbsd
W170508 16:52:45.667141 287 server/server.go:600  [n1,client=127.0.0.1:48957] failed to set TCP keep-alive duration for pgwire: set tcp 127.0.0.1:26257->127.0.0.1:48957: protocol not available

When I run cockroach start --insecure --host=localhost --logtostderr I get the following:

W170508 16:56:37.768362 143 server/status/runtime.go:184  [n1] unable to get file descriptor usage (will not try again): not implemented on openbsd
W170508 16:57:24.417908 551 server/server.go:600  [n1,client=127.0.0.1:12563] failed to set TCP keep-alive duration for pgwire: set tcp 127.0.0.1:26257->127.0.0.1:12563: protocol not available

However, in both cases I can’t connect with cockroach sql --insecure, although netstat indicates that cockroach is listening.

Netstat output in former case:

$ netstat -na |grep LISTEN |grep 2625 
tcp          0      0  *.26257                *.*                    LISTEN

Netstat output in latter case:

$ netstat -na |grep LISTEN |grep 2625
tcp          0      0  127.0.0.1.26257        *.*                    LISTEN
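
The grep pipeline above matches any line containing 2625, so a small filter that compares the port exactly can make the two cases easier to tell apart. This is only a sketch: listeners_on_port is a hypothetical name, and it assumes the BSD-style "address.port" local address column shown in the output above.

```shell
# Hypothetical helper: reads `netstat -na` output on stdin and prints the local
# address of every socket in LISTEN state whose port equals the argument.
# Assumes the BSD "address.port" format in the fourth column.
listeners_on_port() {
    awk -v port="$1" '
        $NF == "LISTEN" {
            p = $4
            sub(/^.*\./, "", p)      # keep only the text after the last dot
            if (p == port) print $4
        }'
}

# Usage:
#   netstat -na | listeners_on_port 26257
```

A result of *.26257 means the node is bound to all addresses, while 127.0.0.1.26257 means it only accepts IPv4 loopback connections.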

I’ve seen both of those warnings in my logs as well but they haven’t affected my ability to connect. Specifically, the file descriptor usage warning reproduces consistently; I only see the TCP keep-alive warning when bootstrapping a cluster (i.e., when the cockroach-data directory already exists). I’m always able to connect, though, no matter whether those warnings are logged.

I suppose it could be your /etc/hosts configuration. If you use --host=127.0.0.1 everywhere you’re currently using --host=localhost, do you have any luck?

Unfortunately that doesn’t work either. I’ve tried adding --host=127.0.0.1, and I have also tried --host=localhost and --host=127.0.0.1 after deleting /etc/hosts. When I try to connect with telnet 127.0.0.1 26257 I do get a connection, but of course I don’t speak the protocol.

However, I do think the following log message could have something to do with it:

E170509 09:13:50.500314 511 server/server.go:625  [n1,client=127.0.0.1:17392] unable to pre-allocate 21504 bytes for this connection: root: memory budget exceeded: 112640 bytes requested, 0 bytes in budget

It appears at the same time as the failed login attempts.

@knz, do you have any insight here?

Yes, the latter error reveals a problem. My guess is that the gosigar package that we use ( https://github.com/elastic/gosigar ) may not be doing its job properly on OpenBSD and doesn’t report the right amount of free memory. This is just a hypothesis, however. What I would really like to see is a GitHub issue where you provide us these details. In particular, run your node with --vmodule=mem_usage=2, experience the client error, then provide in the GitHub issue the full log file of the node that the client is trying to connect to, with the mem_usage log details included. I’ll have a look.

(Regarding the earlier network error: I think the issue here is dead simple. OpenBSD supports a dual IPv4/IPv6 stack, and by default the name “localhost” resolves to IPv6. Using --host=127.0.0.1 on both client and server should be sufficient to keep the situation simple, irrespective of what you have in /etc/hosts (which you should not delete, by the way).)
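
To see which addresses a name can map to on a given machine, a quick look at the hosts file helps. The sketch below is illustrative only: hosts_addresses is a hypothetical helper, it reads nothing but a hosts(5)-style file, and it does not reproduce the resolver's actual ordering or any DNS lookup.

```shell
# Hypothetical helper: prints every address that a hosts(5)-style file maps to
# the given name, in file order. Comments and blank lines are skipped.
hosts_addresses() {
    name=$1
    file=${2:-/etc/hosts}
    awk -v name="$name" '
        { sub(/#.*/, "") }           # strip trailing comments
        NF >= 2 {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; break }
        }' "$file"
}

# Usage:
#   hosts_addresses localhost
```

Run against the /etc/hosts contents posted earlier in this thread, it lists both 127.0.0.1 and ::1 for localhost, which fits the connection attempts to [::1]:26259 seen in the logs.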

@knz I’ve run the node with --vmodule=mem_usage=2 and placed the output in GitHub issue 15836.

For the record: I didn’t actually delete /etc/hosts; I just renamed it temporarily to /etc/hosts.bak.

As explained in https://github.com/cockroachdb/cockroach/issues/15836 there is indeed an issue in CockroachDB, and the workaround is to specify --max-sql-memory and --cache explicitly.

Also thank you for filing https://github.com/cockroachdb/cockroach/issues/15840 where you explain how to fix the issue with the maximum number of open files on OpenBSD.

Thank you for helping us investigate this. Feel free to mark this thread as resolved if that’s how you see the situation.

To summarize for people finding this thread, these are the steps to install CockroachDB on OpenBSD-current:

1: Increase system-wide file descriptor limit:

# echo "kern.maxfiles=15000" >> /etc/sysctl.conf

2: Increase user file descriptor limit:

  • Open /etc/login.conf with a text editor (as root).
  • Add :openfiles=15000:\ to the user’s login class.
  • Reboot.
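
After the reboot it may be worth confirming that the new limit is actually in effect for your login session before starting a node. A minimal sketch, assuming a POSIX shell; nofile_limit_ok is a hypothetical helper name, and 15000 is simply the value used in the steps above.

```shell
# Hypothetical helper: succeeds when the current descriptor limit meets the
# required minimum. A limit reported as "unlimited" always passes.
nofile_limit_ok() {
    required=$1
    current=$2
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$required" ]
}

# Usage (in a fresh login session):
#   nofile_limit_ok 15000 "$(ulimit -n)" || echo "openfiles limit is too low"
```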

3: Compile CockroachDB:

$ export GOPATH=<some path>
$ mkdir -p $GOPATH/src/github.com/cockroachdb
$ cd $GOPATH/src/github.com/cockroachdb
$ git clone https://github.com/cockroachdb/cockroach.git
$ cd cockroach
$ git checkout master
$ CC=egcc CXX=eg++ gmake build TAGS=stdmalloc

Or replace the last line with the following for the open-source version:

$ CC=egcc CXX=eg++ gmake buildoss TAGS=stdmalloc

4: Install CockroachDB binary:

# install -o root -g bin -m 0755 $GOPATH/src/github.com/cockroachdb/cockroach/cockroach /usr/local/bin/cockroach

5: Run CockroachDB:

$ cockroach start --insecure --host=127.0.0.1 --background

Because the physical RAM detection of gosigar doesn’t currently work on OpenBSD, the --max-sql-memory and --cache flags can be used to assign physical RAM to CockroachDB explicitly. The recommended value for each flag is 25% of the total physical RAM. When these flags are omitted, the default of 512MiB is used.
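
The 25% figure can be worked out by hand, or with a small helper like the sketch below. Two assumptions are not from this thread: that sysctl -n hw.physmem reports physical memory in bytes on your system, and that both flags accept a size written with a MiB suffix. The name quarter_ram_mib is hypothetical.

```shell
# Hypothetical helper: prints 25% of the given byte count, in whole MiB.
quarter_ram_mib() {
    echo $(( $1 / 4 / 1024 / 1024 ))
}

# Usage on OpenBSD (assumes hw.physmem is reported in bytes and that the
# flags accept a MiB suffix):
#   size="$(quarter_ram_mib "$(sysctl -n hw.physmem)")MiB"
#   cockroach start --insecure --host=127.0.0.1 --background \
#       --max-sql-memory="$size" --cache="$size"
```

For a machine with 8 GiB of RAM this works out to 2048 MiB for each flag.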
