Network of stores in a country, does CRDB fit the use case?

Hi everybody, I learned about CockroachDB from an interview in the FLOSSweekly podcast, and I am wondering whether it would be suitable for the use case I am dealing with. The intention is to leverage its automagic replication capabilities and thus avoid writing the software that will do the synchronization of a conventional DB.

Before I state the questions, here is a description of the scenario, so you understand the nature of the problem at hand:

  • Imagine a network of 300 mobile booths distributed across a city.
  • The booths use a 3G modem to get a connection to the Internet.
  • The communication infrastructure in the city is not perfect: there are areas with poor coverage, so some of the booths are expected to be offline occasionally, for periods ranging from several seconds to several minutes.
  • Customers are registered in the system and they can make purchases in any of the booths, using their credit points within the system, by typing their username and password.
  • There is a main database that has the data about the credit balance of all the accounts.
  • Each booth has a small computer with a data copy in a local database, to ensure that it can process a transaction even when it is offline.
  • The local DB can be slightly outdated; as stated above, it would be out of sync by several minutes at most, which is acceptable.

The desired effect is that when the booth comes back online, it will tell the main DB about the transactions that were conducted while it was offline, and the main DB will then propagate the update to all the other booths.

Note on security: assume all booths are secure and we are not worried about the possibility of an enemy getting the local copy, or a malicious merchant tinkering with it.

My questions are:

  1. Is CRDB able to handle such a use case in a transparent way? i.e., the software talking to the DB will have the illusion that it talks to a local instance, without awareness of all the other nodes and their synchronization?

  2. How would CRDB’s terminology apply in this case? Is each booth considered a “zone”, thus we’d have 300 zones? (In the interview it was stated that at the moment there’s a deployment of CRDB with 500 nodes - so I thought it would be plausible).

Thank you for taking the time to help me get acquainted with CRDB. I look forward to your feedback.

Hi Alex,

Thank you for your inquiry. Your use case definitely sounds interesting.

Maybe a point that needs clarification before we can provide more advice: what do you expect to happen if a customer walks from one (temporarily disconnected) booth to another and makes a second purchase with their credit points?

Regarding your specific questions:

At this time, CockroachDB cannot handle your use case as a single distributed cluster. This is because any query requires, at a minimum, that a couple of system ranges be available locally.

In general, the moment you present CockroachDB with a network partition (where a part of the system is disconnected from the rest), either one side or the other must stop processing transactions that need data from the other side. Because system ranges are needed for all transactions, a persistent partition will cause full unavailability on one side or the other.
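To make the partition rule concrete, here is a minimal sketch, assuming majority-quorum replication (CockroachDB replicates each range with a consensus protocol, so a side needs a strict majority of a range's replicas to keep serving it). The function name `can_serve` and the replica counts are illustrative assumptions, not CockroachDB API or configuration.

```python
def can_serve(reachable_replicas: int, total_replicas: int) -> bool:
    """Return True if this side of a partition holds a strict majority.

    Illustrative only: models the majority rule, not CockroachDB internals.
    """
    return reachable_replicas > total_replicas // 2


# Hypothetical layout: a system range with 3 replicas, one of which
# lives on a booth that has just lost its 3G connection.
booth_side = can_serve(reachable_replicas=1, total_replicas=3)
cluster_side = can_serve(reachable_replicas=2, total_replicas=3)
print(booth_side, cluster_side)
```

Under this rule at most one side can hold a majority, so a disconnected booth holding a minority of replicas could not process any transaction until it reconnects.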

Therefore, for your use case, using CockroachDB would require the use of a temporary staging database on each mobile booth, and a reconciliation with the online cluster when connectivity is resumed.
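As a rough illustration of what such a staging database and reconciliation step could look like, here is a minimal sketch. It uses SQLite for both sides so that it runs on its own; the main database here is only a stand-in for the online cluster. The table names (`pending_tx`, `applied_tx`, `balances`) and function names are hypothetical. Against a real cluster you would use a PostgreSQL-compatible driver, and the SQLite-specific `INSERT OR IGNORE` would become `INSERT ... ON CONFLICT DO NOTHING`.

```python
import sqlite3
import uuid


def open_booth_db(path=":memory:"):
    """Open the local staging DB on a booth (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pending_tx ("
        " tx_id TEXT PRIMARY KEY, account TEXT NOT NULL,"
        " amount INTEGER NOT NULL, synced INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def open_main_db(path=":memory:"):
    """Open a stand-in for the online cluster (hypothetical schema)."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS balances ("
        " account TEXT PRIMARY KEY, credit INTEGER NOT NULL)"
    )
    db.execute("CREATE TABLE IF NOT EXISTS applied_tx (tx_id TEXT PRIMARY KEY)")
    return db


def record_purchase(booth, account, amount):
    """Record a purchase locally; works whether or not the booth is online."""
    tx_id = str(uuid.uuid4())  # globally unique, so replays can be detected
    with booth:
        booth.execute(
            "INSERT INTO pending_tx (tx_id, account, amount) VALUES (?, ?, ?)",
            (tx_id, account, amount),
        )
    return tx_id


def reconcile(booth, main):
    """Push unsynced purchases to the main DB; return how many were applied."""
    rows = booth.execute(
        "SELECT tx_id, account, amount FROM pending_tx WHERE synced = 0"
    ).fetchall()
    applied = 0
    for tx_id, account, amount in rows:
        with main:  # one main-DB transaction per purchase
            cur = main.execute(
                "INSERT OR IGNORE INTO applied_tx (tx_id) VALUES (?)", (tx_id,)
            )
            if cur.rowcount == 1:  # first time the main DB sees this tx_id
                main.execute(
                    "UPDATE balances SET credit = credit - ? WHERE account = ?",
                    (amount, account),
                )
                applied += 1
        with booth:  # mark as synced only after the main DB has committed
            booth.execute(
                "UPDATE pending_tx SET synced = 1 WHERE tx_id = ?", (tx_id,)
            )
    return applied
```

Recording each `tx_id` in the main database makes the reconciliation safe to retry if the connection drops halfway through. Note that this sketch does nothing to prevent a balance from going negative when the same customer spends at two disconnected booths, which is why the question above about that scenario needs an answer first.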

If you had a way to avoid network partitions, then yes, it would probably be reasonable to define one zone per booth, or perhaps one zone per area of the city.