Cockroach server seems to disappear

hello - i am trying to put together a demo for our company using this humble command:

/usr/local/bin/cockroach start --port=26257 --http-host=localhost --certs-dir=./path-to-certs-directory --background ;

but after a while, the cockroach server seems to disappear. i can fix the problem by reissuing the command.

log below:

gossip server (0/3 cur/max conns, infos 0/0 sent/received, bytes 0B/0B sent/received)
I180806 21:10:56.338573 96 server/status/runtime.go:219 [n1] runtime stats: 194 MiB RSS, 96 goroutines, 77 MiB/67 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 63.49cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (1x)
I180806 21:11:06.337489 96 server/status/runtime.go:219 [n1] runtime stats: 194 MiB RSS, 95 goroutines, 83 MiB/62 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 57.01cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
I180806 21:11:16.337458 96 server/status/runtime.go:219 [n1] runtime stats: 194 MiB RSS, 95 goroutines, 89 MiB/56 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 65.20cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
I180806 21:11:26.337465 96 server/status/runtime.go:219 [n1] runtime stats: 194 MiB RSS, 95 goroutines, 95 MiB/51 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 52.00cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
I180806 21:11:36.337423 96 server/status/runtime.go:219 [n1] runtime stats: 195 MiB RSS, 95 goroutines, 102 MiB/45 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 58.50cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
I180806 21:11:46.337397 96 server/status/runtime.go:219 [n1] runtime stats: 195 MiB RSS, 95 goroutines, 108 MiB/39 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 56.10cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
I180806 21:11:56.336302 93 gossip/gossip.go:488 [n1] gossip status (ok, 1 node)
gossip client (0/3 cur/max conns)
gossip server (0/3 cur/max conns, infos 0/0 sent/received, bytes 0B/0B sent/received)
I180806 21:11:56.336387 93 gossip/gossip.go:333 [n1] NodeDescriptor set to node_id:1 address:<network_field:"tcp" address_field:"cockroach:26257" > attrs:<> locality:<> ServerVersion:<major_val:2 minor_val:0 patch:0 unstable:0 >
I180806 21:11:56.339038 96 server/status/runtime.go:219 [n1] runtime stats: 195 MiB RSS, 96 goroutines, 114 MiB/33 MiB/159 MiB GO alloc/idle/total, 87 MiB/96 MiB CGO alloc/total, 223.86cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
E180806 21:11:56.603432 156 storage/queue.go:862 [n1,replicate] 26 replicas failing with "0 of 1 store with attributes matching []; likely not enough nodes in cluster"
I180806 21:12:06.337092 92 storage/store.go:4516 [n1,s1] sstables (read amplification = 1):
6 [ 8M 1 ]: 8M
I180806 21:12:06.337209 92 storage/store.go:4517 [n1,s1]
** Compaction Stats [default] **
Level Files     Size Score Read(GB) Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop
---------------------------------------------------------------------------------------------------------------------------------------------------
   L0   0/0  0.00 KB   0.0      0.0    0.0      0.0       0.0      0.0       0.0   1.0      0.0      0.1        33         3   10.951     0       0
   L6   1/0  8.27 MB   0.0      0.0    0.0      0.0       0.0      0.0       0.0   5.8     11.2     10.7         1         2    0.720   45K     21K
  Sum   1/0  8.27 MB   0.0      0.0    0.0      0.0       0.0      0.0       0.0   8.7      0.5      0.5        34         5    6.859   45K     21K
  Int   0/0  0.00 KB   0.0      0.0    0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0         0         0    0.000     0       0
Uptime(secs): 10823.8 total, 600.0 interval
Flush(GB): cumulative 0.002, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 0.02 GB write, 0.00 MB/s write, 0.02 GB read, 0.00 MB/s read, 34.3 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count

estimated_pending_compaction_bytes: 0 B
I180806 21:12:06.337611 96 server/status/runtime.go:219 [n1] runtime stats: 195 MiB RSS, 95 goroutines, 120 MiB/27 MiB/159 MiB GO alloc/idle/total, 88 MiB/96 MiB CGO alloc/total, 62.11cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)
I180806 21:12:16.337505 96 server/status/runtime.go:219 [n1] runtime stats: 195 MiB RSS, 95 goroutines, 127 MiB/21 MiB/159 MiB GO alloc/idle/total, 88 MiB/96 MiB CGO alloc/total, 59.80cgo/sec, 0.01/0.00 %(u/s)time, 0.00 %gc (0x)

@edwardsmarkf, can you send over the full logs from the node? Also, to clarify: you’re able to access your one-node cluster, and after a period of time the cockroach process stops? When you say ‘disappear’, you mean you can’t access the cluster locally or through the admin UI? And this is the tail of the most recent log following the process stopping?

hello tim -

i am a serious newbie here. when the node-cluster (is that the right terminology?) starts, i am able to access it using this command:

/usr/local/bin/cockroach sql --user=feathersuser --database=bank --certs-dir=./path-to-certs-directory/ ;

and i can see the starting of the node-cluster using this extremely high-tech method: ps aux | grep cockr;
giving me this result:
root 2187 0.7 23.2 373384 140172 ? Sl 14:51 0:16 /usr/local/bin/cockroach start --port=26257 --http-host=localhost --certs-dir=./path-to-certs-directory
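
A slightly sturdier version of the ps check above could look like the sketch below. It assumes the node is started with cockroach's --pid-file flag (not part of the original command) and that the check runs as the same user as the node. The function name node_alive and the /tmp path are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the process recorded in a PID file is still alive.
# Assumption: the node was started with --pid-file pointing at the file passed as $1.
node_alive() {
    pid_file=$1
    [ -r "$pid_file" ] || return 1      # no readable PID file: treat as not running
    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 1           # empty PID file: treat as not running
    # Signal 0 delivers nothing; it only tests whether the process exists.
    # It also fails if the process belongs to another user, so run this as the node's user.
    kill -0 "$pid" 2>/dev/null
}

# Usage (not executed here):
#   /usr/local/bin/cockroach start ... --pid-file=/tmp/cockroach.pid --background
#   while node_alive /tmp/cockroach.pid; do sleep 10; done
#   echo "cockroach exited around $(date)"
```

Logging the time at which the process vanished makes it easier to line the event up with the system logs afterwards.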

i am happy to share the log with you, although your forum here does not seem to allow for log upload (??) - i get this message back:
(authorized extensions: jpg, jpeg, png, gif, pdf, svg, eot, woff, woff2, ttf).

D’oh. A dropbox link would be fine for the logs if you can’t upload to the forum directly.

So cockroach start is the command that starts a CockroachDB node. Each node typically joins a cluster (via the --join flag). It looks like you’re running a single node here, which is fine for a demo as long as you don’t want to demonstrate our fault tolerance capabilities. If you wanted to do that, you’d need at least three nodes (i.e., run cockroach start on at least three ports, either locally or on separate machines that can connect to each other).

There’s actually some really handy training that’ll walk you through creating a three-node local cluster, generating data, and demonstrating our fault tolerance capabilities step by step in the ops basics section of our training wiki.
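
As a rough illustration of the three-node local setup described above, the sketch below only prints the start commands rather than running them. The flag names follow the 2.0-era cockroach start command seen in this thread and may differ in other versions. The ports, store names, and the node_cmd helper are illustrative assumptions, and the certs directory is the placeholder from the original command.

```shell
#!/bin/sh
# Sketch: print the start command for node N (1-based) of a local secure cluster.
# Assumptions: ports 26257-26259, HTTP ports 8080-8082, stores named node1..node3.
node_cmd() {
    n=$1
    port=$((26256 + n))         # 26257, 26258, 26259
    http_port=$((8079 + n))     # 8080, 8081, 8082
    printf 'cockroach start --certs-dir=./path-to-certs-directory --store=node%s --host=localhost --port=%s --http-port=%s' "$n" "$port" "$http_port"
    # Nodes after the first join the cluster through the first node's address.
    if [ "$n" -gt 1 ]; then
        printf ' --join=localhost:26257'
    fi
    printf ' --background\n'
}

# Dry run: show the three commands without executing them.
for n in 1 2 3; do
    node_cmd "$n"
done
```

Each node needs its own store directory and its own pair of ports when they all share one machine. Three nodes on a VM with 512 MB of memory would not be realistic, so this is only a shape to adapt once the instance is larger.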

tim - this is just for a VERY SIMPLE demonstration to try to move the powers-that-be from mysql/oracle/mariadb to cockroach.

i have no doubt that this is not even scratching the SCRATCH of the surface!

please share dropbox link with me.

also, exercise caution when using ‘D’oh’ – some of us take Homer very seriously.

or convert to pdf? does that work?
cockroach.cockroach.root.2018-08-06T16_56_02Z.022639.pdf (110.8 KB)
cockroach.cockroach.root.2018-08-06T18_11_42Z.023379.pdf (192.8 KB)

Hey @edwardsmarkf, pdf is fine. It looks like you’re running the node with a very small amount of memory (512 MB). This is far below our recommendation of 2 GB, and much less than what we test with internally. With so little RAM, it’s very likely that the kernel ran out of memory and killed the process, in which case cockroach would not get a chance to write anything about it to its own logs. You can confirm this in the system logs. The first couple of answers here contain pointers about the OOM killer and how to check the system logs.
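
One possible way to run the system-log check mentioned above is sketched here. The grep patterns cover the usual Linux kernel wording for OOM kills, but the exact text varies between kernel versions, and the log file locations depend on the distribution. The function name find_oom_kills is hypothetical.

```shell
#!/bin/sh
# Sketch: filter kernel log text on stdin for OOM-killer messages.
# Assumption: the kernel uses one of the common phrasings matched below.
find_oom_kills() {
    grep -i -E 'out of memory|oom-killer|killed process'
}

# Usage (not executed here; reading the kernel log may require root):
#   dmesg | find_oom_kills
#   journalctl -k | find_oom_kills          # on systemd hosts
#   find_oom_kills < /var/log/messages      # or /var/log/syslog, depending on distro
```

A matching line that names cockroach, with a timestamp close to when the node vanished, would support the out-of-memory explanation. No match does not rule it out, because older kernel messages may already have been rotated away.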

LOL! well i guess that’s what happens when you run in the google-vm instance and try to save money:

Resize instance

This instance has had high CPU and memory utilization recently. Consider switching to the machine type: g1-small (1 vCPU, 1.7 GB memory). Learn more

We recommend n1-standard or n1-highcpu instances: https://www.cockroachlabs.com/docs/stable/recommended-production-settings.html#cloud-specific-recommendations

Those are what we use for internal testing. In production, you’d need at least three instances to ensure availability.