Why does CRDB leverage etcd.RawNode instead of etcd.Node to implement Multi-Raft?

replication

(Liang) #1

I am planning to implement Multi-Raft and have been reading through some open-source implementations. One of them, CRDB, uses etcd.RawNode to implement Multi-Raft, but I do not understand why it uses RawNode rather than Node. I found an issue in the etcd repo comparing the two, but it is still open and has no answer (https://github.com/etcd-io/etcd/issues/4932). So here is my question.

Why does CRDB use etcd.RawNode to implement Multi-Raft? What are the benefits of using RawNode instead of Node?

Thanks for reading. I am looking forward to your reply!


(Ben Darnell) #2

Each raft.Node creates its own goroutine (and multiple channels, etc.). That’s pretty expensive when you aim to support hundreds of thousands of raft groups per process, which is why we created and use raft.RawNode. RawNode is not thread-safe by design and relies on external synchronization, so it’s much cheaper when you have a lot of them.
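To make the cost argument concrete, here is a minimal sketch of the pattern described above: many lock-free state machines driven by a small, fixed pool of goroutines, with the caller supplying the synchronization. It does not use the etcd raft package and is not CockroachDB's actual scheduler. The names rawGroup, replica, and driveWithWorkers are hypothetical, and rawGroup only stands in for something like RawNode.

```go
package main

import (
	"fmt"
	"sync"
)

// rawGroup is a hypothetical stand-in for one Raft group's state machine.
// It does no locking of its own and starts no goroutine, so callers must
// serialize access to it.
type rawGroup struct {
	ticks int
}

// tick advances the group's logical clock. It is not safe for concurrent use.
func (g *rawGroup) tick() { g.ticks++ }

// replica pairs a rawGroup with the mutex that the caller uses to guard it.
// This is the "external synchronization" part of the pattern.
type replica struct {
	mu  sync.Mutex
	raw rawGroup
}

// driveWithWorkers ticks every replica `rounds` times using only `workers`
// goroutines, no matter how many replicas there are.
func driveWithWorkers(replicas []*replica, workers, rounds int) {
	work := make(chan *replica)
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for r := range work {
				// Two workers may receive the same replica in back-to-back
				// rounds, so the lock is required for correctness.
				r.mu.Lock()
				r.raw.tick()
				r.mu.Unlock()
			}
		}()
	}
	for round := 0; round < rounds; round++ {
		for _, r := range replicas {
			work <- r
		}
	}
	close(work)
	wg.Wait()
}

func main() {
	// 10,000 groups, but only 4 goroutines doing the work.
	replicas := make([]*replica, 10000)
	for i := range replicas {
		replicas[i] = &replica{}
	}
	driveWithWorkers(replicas, 4, 3)
	fmt.Println("groups:", len(replicas), "ticks on first group:", replicas[0].raw.ticks)
}
```

With a goroutine-per-group design, the number of goroutines and channels grows with the number of groups. In the sketch, that number is fixed by `workers`, and each idle group costs only the memory for its struct and mutex.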