This topic is about using CockRoachDB within the HealthCare sector; and I have several questions.
For each question I will try to explain the use case to my best of my ability.
#1 How to query the range fragmentation. (Fragmentation)
I would like to know how I can query the range fragmentation in relation to a interleaved table configuration.
Setup: Patient (Parent) -> Child tables -> Child Tables; examples of child tables; orders, laboratory results, documents etc…
The current default range is 64mb.
I would like to answer the following question:
To which size do I need to increase the range size, because I know for a fact that the majority of patients use more than a single range.
Therefor I would like to query the database and get a report back which can tell me the 90% would fit into a range size of 256MB. Preferably I would like to be able to give a threshold within the query. The idea behind this calculation is that it will calculate over the entire interleaved hierarchy for the patient.
CALCULATE RANGE_SIZE OVER
patient THRESHOLD 90;
Total Rows: 1,859,000 Rows
| Threshold | Size |
| 90 | < 164 MB |
Which would tell me that I need to increase my range size to 256 MB.
Question: Is this possible; If not any chance something like this can become a feature ?
#2 Does increasing the range size also reorder the data ?
When I have patient data fragmented over several ranges. Does the current auto re-balancing after increase the range size also reorder the ranges and tries to put the patient data back into a single range as much as possible, like when it does when storing patient data and there is still space left in the current range. I could not find any information about this within the docs.
#3 Create Range on new patient INSERT and reserving the range for a single patient.
Is it possible to tell the parent interleaved table to create a new range when a new patient is inserted into the top level patient table. This would include telling cockroach to keep this entire range space reserved for this patient only. And when the range is full it would simply create a new range also belonging to this patient.
Imagine the following scenario. (This is also linked to #1). 90% of the patient data would fit into 256MB. By telling the top interleaved table to create a new range for each patient. All data for a patient would be kept into a single range. Which would make sure there is always high performance.
The other 10% can be moved to a separate nodes with its own replication zone for patient which consists for a large dataset. (It is unclear to me if this would be possible even with the enterprise license) because I don’t want to pin a row to a specific replication zone but a row with all interleaved child data into a range on a set of nodes in which the range size would be about 1GB for this specific replication zone only, which allows me to store PET-, CT-, MRI-Scans, and still provide me with a lot of performance.
Hopefully you can see that these items are all interconnected; hope anyone can provide some answers or some insight.
Thanks in advance.