Hi, are there any benchmark results available for CockroachDB? A comparison with known alternatives would be awesome.
We don’t have any benchmark results published at this time, but are actively working on it. Do you have particular benchmarks and database systems you’d like to see numbers for?
I’m trying to test the performance of a 10-node CRDB cluster against Cassandra. Do you recommend any scripts/frameworks for this purpose? What is the CRDB team using for performance analysis and benchmarking? I’m thinking of something similar in spirit to the benchmarks etcd recently did.
We use a variety of load generators for performance analysis and benchmarking:
ycsb generates key-value workloads and
tpch generates queries from the TPC-H benchmark. These load generators are located in the loadgen repo. There are a handful of other load generators in the examples-go repo, such as
bank2, which simulates simple bank schemas,
block_writer which was a precursor to
photos which simulates a photo sharing/commenting workload.
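For readers who want a feel for what a key-value load generator does, here is a minimal sketch in Python. It is not the code from the loadgen or examples-go repos: `InMemoryStore` and `run_kv_workload` are hypothetical names made up for illustration, and the in-memory dictionary is only a stand-in for a real database client.

```python
import random
import time


class InMemoryStore:
    """Hypothetical stand-in for a database client; replace with a real driver."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


def run_kv_workload(store, num_ops, read_fraction=0.5, key_space=1000, seed=0):
    """Issue a random mix of reads and writes and record per-operation latency."""
    if num_ops <= 0:
        raise ValueError("num_ops must be positive")
    rng = random.Random(seed)  # seeded so that runs are repeatable
    latencies = []
    reads = writes = 0
    for _ in range(num_ops):
        key = f"key-{rng.randrange(key_space)}"
        start = time.perf_counter()
        if rng.random() < read_fraction:
            store.get(key)
            reads += 1
        else:
            store.put(key, rng.getrandbits(64))
            writes += 1
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    p99_index = min(len(latencies) - 1, int(len(latencies) * 0.99))
    return {
        "reads": reads,
        "writes": writes,
        "p50_seconds": latencies[len(latencies) // 2],
        "p99_seconds": latencies[p99_index],
    }


if __name__ == "__main__":
    print(run_kv_workload(InMemoryStore(), num_ops=10_000, read_fraction=0.9))
```

To compare two systems, the same function could be pointed at a client wrapper for each one, keeping the seed, key space, and read fraction identical so that only the backend differs.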
We’ll definitely have a blog post or posts similar to the etcd one at some point soon. I can’t promise a particular date, though, partly because we’re still working on performance so the numbers are changing rapidly.
Very exciting; I’m trying to map out a few details regarding our team swapping Postgres for CockroachDB on one of our internal services. We’ve done some initial testing already, but just wanted to get someone’s personal opinion on the following:
Given the beta status of CockroachDB, how adventurous do you think it would be to run CockroachDB in a production environment, assuming we have a failover running Postgres? (Just in terms of your general opinion based on project familiarity, e.g. “data safety issues” versus “might stall occasionally” or something in between.)
Additionally, in terms of performance, is there anything users could provide in addition to existing telemetry that would be helpful? I didn’t see anything specific soliciting feedback, but we’d be thrilled to provide our real world use experience if it’d be useful to the project, either as a use-case on launch or for furtherance of optimizations.
Thanks in advance!
I’d consider running CockroachDB in production a medium risk right now and that risk could be modulated by how much testing you can do. Simply swapping Postgres for CockroachDB with zero testing would be very risky. But if you tested your workload/app against CockroachDB in a development/staging area then the risk is moderate. Data unavailability issues (i.e. “might stall occasionally”) are much more likely than data safety issues.
In terms of performance, we’re always looking for new workloads to use as benchmarks or load generators. I’d definitely be interested in learning about what your schema is and the types of queries performed against it in order to work up a synthetic load generator that could model the workload.
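As a rough illustration of how a schema and query description could be turned into a synthetic load, here is a small Python sketch that samples queries according to their relative frequency. The template names and weights in `QUERY_MIX` are invented for the example, and `sample_queries` is a hypothetical helper, not something from the CockroachDB load generators.

```python
import random
from collections import Counter

# Hypothetical query mix: template name -> relative frequency in the real workload.
QUERY_MIX = {
    "select_by_primary_key": 70,
    "insert_row": 20,
    "range_scan": 10,
}


def sample_queries(mix, n, seed=0):
    """Return n template names drawn in proportion to their weights."""
    rng = random.Random(seed)  # seeded so the generated sequence is repeatable
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    # Show how often each template appears in a sampled sequence.
    print(Counter(sample_queries(QUERY_MIX, 1000)))
```

A load generator built this way would map each sampled template name to a parameterized statement and execute it against the cluster, so the mix of reads, writes, and scans resembles the application being modeled.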
We are also considering using CockroachDB and are working on testing it and later running it in production. What are the main issues leading to “moderate risk”? We are interested in the nature of the issues to better understand the likelihood of us being able to address them (or the time before it is fixed by others).
Got it. It’d be a re-implementation of schema / queries as you suspect, not a direct Postgres swap. We’d just be using Postgres as a hot standby.
Thanks for your reply; we’ll update as soon as we’ve given it a shot.
@bartz The moderate risk would be data unavailability during failure scenarios. We’ve done a lot of testing of various failure scenarios and I’m confident CockroachDB won’t lose data, but there are always more scenarios out there than what you test.
Hypothetically, if that were to happen, would one lose data from all replicated nodes in a given cluster?
@micky Not sure if I’m understanding the scenario you’re concerned about. Data unavailability is not the same as data loss. But even the data unavailability concern is much less now. If a node fails, Cockroach automatically notices and shifts load to other nodes. The most you should experience is a performance blip of a few seconds.
@micky, in terms of how CockroachDB recovers from failure, we have a simple demo you can run through with a local cluster, if you like: https://www.cockroachlabs.com/docs/demo-fault-tolerance-and-recovery.html
I was referring to the worst-case scenario. But @jesse, your reply does address my concern.
P.S. I already went and looked that up in the docs. Glad to find out that the zoning is extensively configurable.
Just wanted to give you a quick thank you for your response a couple months back.
We’re currently running CockroachDB 1.0 and are having a great experience thus far. Performance is actually far above what we expected at this stage, with tables containing a few hundred million rows each. It’s also worth noting that even in beta, we had no issues whatsoever.
We aren’t yet at a stage where our CockroachDB deployment is large enough to benefit from Enterprise, so all I can offer is praise and recommendations to others. With that said, I wanted to make sure you (and readers of this thread) were aware of how it turned out for us.
Delightful, @travis. Keep us posted in the future.
There’s so much work on performance improvements remaining; expect the roach to get better.
Thanks for the update @travis and for spreading the word.
@travis So glad to hear that you are having a good experience! What would be very helpful for us is if you could spare some time to walk through your experience with CockroachDB so far so that we can improve going forward. Let me know if you have time to do that for us.
Absolutely. We already have quite a bit of documentation of our use case and comparisons with other software in our internal feasibility study, which might be helpful to you internally or as a real world testimonial.
What’d be the best way to reach you?