Can CockroachDB take advantage of GPU cores? I mention this because for some large multi-TB use cases, the number of required CPU cores becomes very high. For example, this Cloudera Impala guide indicates 800 nodes required for a 60 TB database and 100 concurrent queries. In their example, this equates to 51.2 TB memory & 9,600 cores. This would be fairly cost prohibitive with traditional CPUs but the NVIDIA GTX 1080 has 2,560 cores…
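For reference, a quick back-of-the-envelope check of those Impala figures (the per-node numbers below are my own arithmetic derived from the totals, not quoted from the guide):

```python
# Cloudera Impala sizing example from the question above.
nodes = 800
total_ram_gb = 51_200  # 51.2 TB total memory
total_cores = 9_600

# Per-node resources implied by those totals.
ram_per_node_gb = total_ram_gb / nodes  # 64 GB RAM per node
cores_per_node = total_cores / nodes    # 12 cores per node

print(ram_per_node_gb, cores_per_node)
```

So each node in that example is a fairly modest 12-core, 64 GB machine; the cost comes from needing 800 of them.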
The compute power needed for a SQL database is a combination of processing cores and I/O bandwidth to all three essential resources: RAM, disks, and the network.
GPUs only offer high I/O bandwidth to their own RAM; bandwidth to disks and the network is very constrained, because the data has to move from GPU RAM to the network/disk via the main processor.
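To put rough numbers on that asymmetry (these bandwidth figures are ballpark assumptions for a GTX-1080-class card, PCIe 3.0 x16, and 10 GbE, not measurements):

```python
# Approximate peak bandwidths in GB/s (assumed ballpark figures).
links = {
    "GPU <-> GPU RAM (GDDR5X)": 320.0,    # on-card memory bandwidth
    "GPU <-> host (PCIe 3.0 x16)": 16.0,  # every byte to disk/network crosses this
    "host <-> network (10 GbE)": 1.25,
}

data_gb = 1_000.0  # time to stream 1 TB through each link
for link, gb_per_s in links.items():
    seconds = data_gb / gb_per_s
    print(f"{link}: {seconds:.0f} s to move 1 TB")
```

Even with optimistic figures, the link to disk and network is one to two orders of magnitude slower than the GPU's own memory, so the cores starve on anything that doesn't fit on the card.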
Trying to run a SQL database on 2,000 GPU cores would be like trying to feed power to 2,000 computers through a single power supply: only a few would have enough data to work with.
(Not to mention the fact that there are no programming languages that help implement network services, like a database server, on a GPU architecture. This is the single biggest reason GPUs are not used more, even where the architecture would otherwise be suitable performance-wise. Blame NVidia for keeping their architecture details proprietary, preventing the development of good open-source compilers.)
Also consider that a database running on a single GPU, no matter the number of cores, would be very vulnerable to system crashes, power outages or network problems. This is not the right way to build scalable, resilient data stores.
That type of approach is basically brute force: throw hardware at largely unindexed, often unstructured data and do full table scans for every query, instead of structuring a database. I get over 3 million QPS on a 40 TB database sharded over 19 four-core VMs. (That is using MySQL; I'm not sure how CockroachDB would compare, probably slower than MySQL, but still orders of magnitude faster than something like Impala. It shows the difference between a true database and a pile of data doing map-reduce.) The sharding in my case is more to keep downtime to a reasonable length than because a single machine couldn't handle the load…
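To make the sharding idea concrete, here is a minimal sketch of hash-based shard routing (my own illustration; the poster's actual MySQL setup is not described in detail):

```python
import hashlib

NUM_SHARDS = 19  # one shard per VM, matching the 19-VM example above

def shard_for(key: str) -> int:
    """Deterministically route a row key to one of the shards."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# A point query touches only the shard that owns the key,
# instead of scanning all 40 TB.
print(shard_for("user:12345"))
```

The point is that an indexed, keyed lookup hits one small machine, which is how a few modest VMs can sustain millions of QPS where a scan-everything architecture needs hundreds of nodes.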
There are ways to write generic code for GPUs, but that is not the bottleneck. You might gain some performance by running the brute-force map-reduce on a GPU, but the bottleneck is I/O unless you can fit everything in GPU memory. Note the 51.2 TB memory in the example above: how many GPUs would it take to get 51 TB of memory? Without that, your bottleneck, even with PCIe SSDs, is going to be I/O, so the GPUs will not help much.
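Doing that arithmetic explicitly (the GTX 1080 mentioned earlier ships with 8 GB of GDDR5X):

```python
import math

total_ram_gb = 51_200  # 51.2 TB from the Impala sizing example
gpu_ram_gb = 8         # memory on one GTX 1080

# Number of cards needed just to hold the working set in GPU RAM.
gpus_needed = math.ceil(total_ram_gb / gpu_ram_gb)
print(gpus_needed)  # 6400 GPUs
```

6,400 GPUs just for memory capacity, before considering how to keep them fed with data, makes the "cheap cores" argument a lot less attractive.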