Hello
-
A node in CRDB can hold many (100k+) ranges so it seems common that for any two nodes in a CRDB cluster they will both be members of the same raft range. Thus the communication overhead for replication is at least (1/hearbeat_interval_ms)xNumberOfNodes (i.e. the sum over all nodes of (1/hearbeat_interval_ms)x(NumberOfNodes-1)). And this assumes the multiraft batching - otherwise the overhead would be (1/hearbeat_interval_ms)xNumberOfRanges?
-
How does the auto-rebalancing algorithm work? Is there a preference for trying to reduce the communication overhead mentioned above? What tradeoffs are taken into consideration for rebalancing optimization?
-
Couldn’t it be possible that a transaction that spans many ranges could end up creating O(NumberOfNodes) overhead due to the first question? It seems possible that I could be interested in some dataset such that the raft leaders that owns that dataset includes every node in the cluster?
-
What’s the fundamental scaling bottleneck that keeps CRDB from scaling past 64 nodes? Is it the gossip mechanism (seems unlikely)? Is it multi-raft with O(NumberOfNodes) overhead (seems unlikely)? Is it the possibility of pathological behavior from my hypothetical third point?
Thanks,
Susan