50k databases: does this work?


(Paul) #1

Hello,

I’m a newbie trying to figure out whether CR fits our use case.

How does CR perform with a large number of (relatively) small databases? Like 50k databases, 30 tables each?

Thanks,

Paul


(Ron Arévalo) #2

Hey @paul

We’d like to know a bit more about your use case. What is the reason behind splitting your data in this way?

I am assuming that you want to split this data into different tables based on some property, like an app or user ID. If that is the case, I would suggest creating one table with an index on that property. If you go that route, you’ll want to include the indexed column whenever you query the table.
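
A minimal sketch of the single-table layout described above. It uses Python's built-in sqlite3 module purely as a stand-in so that it runs anywhere; the table name `events`, the column `customer_id`, the index name, and the sample rows are all hypothetical, and the DDL would need to be adapted to CockroachDB's SQL dialect.

```python
import sqlite3

# In-memory SQLite database, used only as a stand-in for illustration.
conn = sqlite3.connect(":memory:")

# One shared table for all customers; customer_id is a hypothetical tenant column.
conn.execute(
    "CREATE TABLE events ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL, payload TEXT)"
)
# Index on the property that would otherwise have decided the split.
conn.execute("CREATE INDEX events_customer_idx ON events (customer_id)")

# Made-up sample rows: (customer_id, payload).
rows = [(1, "a"), (1, "b"), (2, "c")]
conn.executemany("INSERT INTO events (customer_id, payload) VALUES (?, ?)", rows)


def events_for_customer(conn, customer_id):
    """Return payloads for one customer, always filtering on the indexed column."""
    cur = conn.execute(
        "SELECT payload FROM events WHERE customer_id = ? ORDER BY id",
        (customer_id,),
    )
    return [row[0] for row in cur.fetchall()]


print(events_for_customer(conn, 1))
```

Routing every read through a helper like `events_for_customer` is one way to make sure the indexed column is never left out of a query.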

Does that help? Let me know if you have any more questions.

Thanks,

Ron


(Paul) #3

Hey Ron,

thanks for the quick reply.

No, we need a different database structure (tables, fields) for every customer, so that every customer gets its own customized database.

Thanks,

Paul


(Ron Arévalo) #4

Hey @paul,

At the moment we are not optimized for such a large number of DBs, but we would be interested to see the outcomes of any testing that you may do. Depending on the schema and the characteristics of your workload, you may find that CRDB performs at an acceptable level. If not, more detail about your DDL and the results of your testing will help us improve the product down the road.

Thanks,

Ron


(Paul) #5

Hey @ronarev,

we are still taking our first steps with CRDB, but it looks really cool so far! Very promising product.

Right now we are adding it to our Kubernetes-based production environment, and will use it in parallel with Postgres in a small part of the product.

Can you tell us a bit more about what the limiting factor is with many databases? RAM? What could slow down the queries?


(Ron Arévalo) #6

Hey @paul

You can run into many issues when moving towards a multi-tenant database. At the moment, Cockroach 2.1 has only been tested with up to 8k tables, so beyond that you may start to see new sources of resource contention. Some of the known limitations deal with the number of ranges a node can support. We have a GitHub issue that goes into more detail on these limitations here.
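
For a rough sense of scale, the arithmetic can be sketched as follows. The three figures come from this thread; the comparison is only a back-of-the-envelope estimate and says nothing about how the system would actually behave.

```python
# Figures taken from this thread.
DATABASES = 50_000
TABLES_PER_DATABASE = 30
TESTED_TABLE_LIMIT = 8_000  # "tested with up to 8k tables"

total_tables = DATABASES * TABLES_PER_DATABASE
ratio = total_tables / TESTED_TABLE_LIMIT

print(f"total tables: {total_tables:,}")
print(f"multiple of the tested table count: {ratio:.1f}x")
```

If each table occupies at least one range (an assumption here, not something stated in this thread), the number of ranges would be at least as large as the table count.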

Thanks,

Ron