Should I bother creating more than 1 cluster?

Hi everyone, I’m building a cloud service for an application.

My plan is to run all the databases on one large cluster (on DigitalOcean). Each of my users could have multiple small databases on the cluster.

Does it matter that there could be millions (even billions) of small databases, around 10 or 100 MB each? The database names would be unique hashes.

Should I bother creating more than 1 cluster? No one will ever need to manually administer any of the databases or tables. Some of them may be big, but most will be small. My plan was to just specify the database to use with each query.

Besides that, I was hoping it would just work :wink:
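For illustration, here is a minimal sketch of what "specify the database with each query" could look like. It only builds the SQL text and assumes the `database.schema.table` naming form with the default `public` schema. The function names and the `records` table are hypothetical. Since hash names can start with a digit and identifiers cannot be passed as bound parameters, the sketch quotes them explicitly.

```python
def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified_select(db_name: str, table: str) -> str:
    """Build a SELECT that names the database explicitly.

    Assumes the table lives in the default `public` schema.
    """
    return f"SELECT * FROM {quote_ident(db_name)}.public.{quote_ident(table)}"


# Hypothetical hash-named database and table name.
query = qualified_select("3fa9c1", "records")
print(query)
```

This way no session-level "current database" has to be switched between queries, so one connection can serve many of the small databases.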

Am I creating a bottleneck of some kind?

Thanks for any thoughts.

Hi @makeshyft-tom,

There’s nothing intrinsic about Cockroach that would prevent your idea from working. However, you might not see the performance that you expect - it depends on the workload.

Could you share more details about the expected workload that your service will encounter? What will the user databases be for? What is your reason for sharding the cluster by database instead of having a large database that has a user column?

Thanks,
Jordan

Thanks for your reply.

I think I was considering creating smaller clusters because the data my users produce on their local machines, when submitted, is combined with other users’ data and inserted into one database, which is then accessible to all the users whose data has been combined.

Some databases will be a combination of 2 users, some 10, some 100, some 10,000, and so on.

Since the databases are not “related” to each other and are immutable once created, each database on the server is literally “alone” and completely self-contained. Once created, they will be read-only for the users who combined to make the file.

So the load on each database is going to vary. Some could be 200 records in 5 tables and some 10,000 records in 8 tables.

So I thought maybe if I’m going to have millions of separate databases on the same cluster, maybe making different clusters that host x number of databases could avoid some problems down the road.

The thought was that I could have 1 cluster for database names that start with a-d, another for names that start with 1-4, and so on.

I hope I explained my use case. I think I’m going to start with 1 cluster and then reconsider if I notice any problems. Moving data seems trivial with CockroachDB.
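A rough sketch of that routing idea, assuming the hash names are hexadecimal so the first character is always 0-9 or a-f. The cluster names and the particular character buckets are made up for illustration.

```python
# Hypothetical buckets: leading characters -> cluster name.
CLUSTER_BY_PREFIX = {
    "0123": "cluster-0",
    "4567": "cluster-1",
    "89ab": "cluster-2",
    "cdef": "cluster-3",
}


def cluster_for(db_name: str) -> str:
    """Pick a cluster from the first character of the database name."""
    if not db_name:
        raise ValueError("database name must not be empty")
    first = db_name[0].lower()
    for prefixes, cluster in CLUSTER_BY_PREFIX.items():
        if first in prefixes:
            return cluster
    raise ValueError(f"no cluster configured for names starting with {first!r}")


print(cluster_for("a3f9"))  # falls in the "89ab" bucket
```

Because hash names are spread roughly evenly over their first character, buckets of equal size should end up with a similar number of databases each.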

Thank you for any further thoughts on this.

@makeshyft-tom,

Sorry for the long delay in response. I would caution that we haven’t tested with very large numbers of databases and tables before. You may run into problems, since every table gets its own range, so the number of ranges grows with the number of tables.
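To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes at least one range per table and uses the figures mentioned earlier in the thread (millions of databases, 5 to 8 tables each). The function name is hypothetical, and real range counts would depend on data size and cluster settings.

```python
def min_range_count(num_databases: int, tables_per_database: int) -> int:
    """Lower bound on ranges, assuming each table gets at least one range."""
    return num_databases * tables_per_database


# One million databases with 5 tables each.
print(min_range_count(1_000_000, 5))

# One billion databases with 8 tables each.
print(min_range_count(1_000_000_000, 8))
```

Even the smaller case implies millions of ranges, which is the scale that has not been tested.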

I think you’re right that having several clusters might serve you better in the near term.

Jordan

Thanks for the warning. We have a ways to go before we have to worry about it, but I am a forward-thinking kinda guy. I appreciate the help, and no worries about the time.