Row Level Security with Multi-tenancy

I have a requirement for row-level security for the following use cases:

  1. A user belongs to a branch of the company and, depending on their role, can access data for only that branch, a few branches, or all branches.
  2. A user should also be able to create notes and share a note with any employee in the organization with Read / Update / Delete privileges.

I need some assistance on how the access control should be built. I have also built a simple layer over the database for multi-tenancy purposes, and intend to add this row-level security into that layer itself.

I did think of a group-based system, where every user can be part of one or more groups, and whenever a note is shared a group is created for those two users. But I am worried it will be a nightmare when querying the notes table, as I will have to add WHERE clauses for all the groups a user belongs to.

Thanks a Ton.

Hi karansoi,

Role-based Access Control is on our 2.0 release roadmap as an enterprise feature; however, row-level access control is not on our near-term roadmap.

For now, I think your best option is to manage this access control entirely in your multi-tenant layer above CockroachDB. Give the application accessing the database (i.e. your multi-tenant layer) a single user account which belongs to the application. The application then authenticates your users and restricts access based on the specific user. You will need to maintain your own user accounts and perform your own authentication, but this option should be able to meet any requirements you have in terms of access restriction.

An alternative might be to create several views over the data tables, where each view is restricted to some subset of the table with a WHERE clause, and then only GRANT restricted users access to those views instead of the main table. This will work if your restricted users are only reading data, but unfortunately it won’t work if users need to write to the tables in a restricted way.
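As a rough sketch of that approach (the notes table, branch column, and user name here are hypothetical):

```sql
-- Hypothetical: expose only one branch's rows through a view.
CREATE VIEW notes_branch_a AS
    SELECT id, title, body
    FROM notes
    WHERE branch_id = 1;

-- Restricted users are granted the view, not the underlying table.
GRANT SELECT ON notes_branch_a TO reporting_user;
```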

As for the problem of sharing conversations between users, my intuition is that you’re looking for an associative table. Something like:

CREATE TABLE application_users (
   id INT PRIMARY KEY
);

CREATE TABLE notes_by_user (
   user_id INT NOT NULL REFERENCES application_users (id),
   note_id INT NOT NULL REFERENCES notes (id),
   PRIMARY KEY (user_id, note_id),
   INDEX (note_id)
);

Insert a record into notes_by_user for each user associated with a note. To get the notes associated with a user:

SELECT note_id FROM notes_by_user WHERE user_id = 'username'

Depending on your schema, you could also JOIN against the notes_by_user table to directly access other tables indexed by either note_id or user_id.
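For instance, assuming a notes table keyed on id, fetching the full rows of all notes shared with a given user could look something like:

```sql
SELECT notes.*
FROM notes
  INNER JOIN notes_by_user ON notes.id = notes_by_user.note_id
WHERE notes_by_user.user_id = 42;
```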

Matt T

Hey Matt,
Thanks for your quick response.

I actually require (or prefer) each row to belong to a set of people, something like a role ID or group ID, as I want the layer itself to take care of the row-level security, so that developers using the layer don't have to account for that security themselves.

I did think of another solution: the rows of a table can have an additional column, i.e. a group ID. Since a user belongs to one or more groups, the layer would automatically add the group IDs as a WHERE clause to the query, like

SELECT * FROM TableName WHERE group_id = 1 OR group_id = 2 … OR group_id = n;

I would have the layer automatically add the group IDs that the user belongs to in the WHERE clause, but I wanted to know: is that best practice, or even good practice, or should I do it a different way?
If a user is part of a hundred groups, the layer would in turn add a hundred conditions to the WHERE clause.
So I wanted to know the maximum number of conditions that can be added to the query, as I don't want it to start making the query run slow.

A better syntactic alternative would be to use an “IN” statement, such as:

SELECT * FROM TableName WHERE group_id IN (1, 2, ..., n);

In a situation where users are part of multiple groups, and every row of TableName belongs to exactly one group, that might be about as good as you can get.

In terms of performance, the tricky part depends on the structure of the tables; if there are a small number of groups, and each user has access to most of the data in the table, then using the normal index might make sense - the group_id clause will be used to filter out a small number of rows from an otherwise valid data set.

However, if each group only owns a small number of rows, you’ll probably want to index by group_id and force the usage of that index. You can do that with an index hint:

SELECT * FROM TableName@group_id_idx WHERE group_id IN (1, 2, ..., n);

This is a manual tuning step. In the future, CockroachDB may include a query optimizer which will be able to look at statistical data for each table and determine the best index to use, but currently it does not have this ability.

Finally, as one additional option, if you have a table which maps users to groups, you could also express this as a join:

SELECT TableName.*
FROM TableName
  INNER JOIN GroupsByUser
  ON TableName.group_id = GroupsByUser.group_id
WHERE GroupsByUser.user_id = 1;

Hey Matt,

That’s great, I will use the new syntax. Also, I have already planned to index the group_id:

INDEX (group_id)

but I wanted to know if there is any upper limit, performance-wise, on the number of group_ids I can add to that WHERE IN clause.

Hi Karansoi,

There is no explicit limit on the number of elements in that expression.

There may be a practical limit for performance, but I think that’s something you will have to arrive at in your own testing; without knowing more about the number of expected groups or the number of expected rows in the table, it’s hard to speculate on the best strategy.

Hey Matt,

I did a small test: I filled one table with a random dataset of 422,000 rows, and then ran a WHERE IN query filtering on about 200+ group_ids. The group_id was indexed. Following are the details of the test.
I ran the test on 3 g1-small servers in GCP, each with a 20 GB SSD.

The creation of the table:

CREATE TABLE first (
    id INT NOT NULL DEFAULT unique_rowid(),
    "name" STRING(256) NOT NULL DEFAULT random()::STRING,
    second_name STRING(256) NOT NULL DEFAULT random()::STRING,
    last_name STRING(256) NOT NULL DEFAULT random()::STRING,
    uid UUID NULL DEFAULT uuid_v4()::UUID,
    suid UUID NULL DEFAULT uuid_v4()::UUID,
    date_time_stamp TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
    time_stamp TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
    update_time TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
    group_id FLOAT NOT NULL DEFAULT round(random(), 4:::INT) * 10000.0:::FLOAT,
    INDEX group_id_idx (group_id ASC),
    INDEX first_time_stamp_idx (time_stamp ASC),
    FAMILY "primary" (id, "name", second_name, last_name, uid, suid, date_time_stamp, time_stamp, update_time, group_id)
);

Then I inserted random data into the table (420,000 rows). :stuck_out_tongue:

INSERT INTO first("name", last_name) VALUES ('Karan', 'Soi');

The P99 latency of the inserts was 84 ms.

This randomly created about 10,000 distinct groups.
Then I created a select query with about 200+ group_ids in the where in clause.

SELECT id, group_id, time_stamp FROM first WHERE group_id IN (…);

This took approx 19 secs to retrieve 8859 rows.

SELECT id, group_id, time_stamp FROM first WHERE group_id IN (…) LIMIT 1;
This took 29 ms.

SELECT COUNT(*) FROM first WHERE group_id IN (…);

This took 55 ms to return a count of 8859.

What exactly is going on? Should it be taking 19 secs for that query, and should I proceed with this as a row-level security system?
How do I reduce the time of the query? Am I doing anything wrong?

Hi karansoi,

After analyzing your query with EXPLAIN, I think that the index join is about as efficient as you can get without modifying your schema a bit:

root=> EXPLAIN SELECT id, group_id, time_stamp FROM first WHERE group_id IN (1, 2, 3);
 Level |    Type    | Field |                             Description
-------+------------+-------+----------------------------------------------------------------------
     0 | render     |       |
     1 | index-join |       |
     2 | scan       |       |
     2 |            | table | first@group_id_idx
     2 |            | spans | /1-/1.0000000000000002 /2-/2.0000000000000004 /3-/3.0000000000000004
     2 | scan       |       |
     2 |            | table | first@"primary"
(7 rows)


This indicates an index join, which first scans the matching IDs from the group_id index, then looks up each ID in the primary table. The second part (the lookup in the primary table) is the expensive one: it involves 8859 separate lookups of the primary table, and while each individual lookup is acceptably quick, the cost adds up in aggregate. This system would work well in a case where a user has access to very few rows in the “first” table, but the performance degrades quickly as the number of rows increases.

A couple of options to speed up performance:

  1. Modify your primary key to (group_id, id), and then add an index on (id).

If you are expecting that the majority of incoming queries are going to include group_id in the where clause, this might be your best bet. Your queries will be able to retrieve all relevant data while only having to look at the primary key, no indexes involved.

The tradeoff is that lookups or updates based only on ID might be a little slower, as they would need to go through the index.
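A sketch of that layout, using a trimmed-down, hypothetical version of the test table (since the primary key of an existing table can't be changed in place in this version, the table would need to be created, or recreated, this way):

```sql
-- group_id leads the primary key, so WHERE group_id IN (...) scans
-- contiguous spans of the primary index and needs no index join.
CREATE TABLE first_by_group (
    group_id INT NOT NULL,
    id INT NOT NULL DEFAULT unique_rowid(),
    time_stamp TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
    PRIMARY KEY (group_id, id),
    INDEX (id)  -- keeps direct lookups by id reasonably fast
);
```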

  2. Use a STORING index.

A storing index stores additional fields in the index itself, meaning those fields can be returned directly from the index without touching the primary table. In the case of your example query, modifying the group_id_idx to:

INDEX group_id_idx(group_id ASC) STORING (time_stamp)

would make the lookup much faster, as it would only have to do the index lookup in order to return the data. However, this only works for queries in which all requested fields are stored in the index.

You would have the option of storing all fields in the index, which means that lookups either by “group_id” or “id” would only need to read one index; however, you would be duplicating your storage requirements and your write I/O requirements.

One additional thing we noticed: your group_id column is a FLOAT; this takes up more space than either an INT or a BIGINT, and it also has consequences for your WHERE IN clause (with an INT data type, adjacent group values can be merged together for more efficient scans of the group_id_idx).
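For what it's worth, one way to generate a random integer group at insert time (a sketch; any expression producing an INT in the desired range would do):

```sql
-- Random integer group_id in [0, 9999], stored as INT rather than FLOAT.
group_id INT NOT NULL DEFAULT (floor(random() * 10000))::INT
```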

Hey Matt,

I tried option number 1, i.e. modifying the primary key, but oddly that didn’t give any noticeable improvement. Also, a STORING index might not be the right solution for my requirement, because I cannot do that for all my tables.

I will definitely be storing the group_id as an INT; I was only using FLOAT for the random generation, because I was getting some errors while casting.

Also, during the tests I found a problem with diminishing insert speed: for the first 400,000-odd inserts, the inserts ran at 400 - 500 QPS, but after that they gradually dropped to 70 - 100 QPS. I had built a custom piece of code to insert random values into the database for load tests and row-level security, and I was running the test on the following server configuration:

  1. CockroachDB running on 3 g1-small instances in GCE with 50 GB of storage each.
  2. Java Tomcat, to run the tests.

I checked that the RAM usage was really low, i.e. 250 MB, but the CPU usage was high from the very beginning. So I wanted to figure out what is going on, and why there is such a massive drop in speed.

In the beginning, the P99 latency per query was below 100 ms, but as the database reached 700,000 entries it slowed down to a P99 latency of 4 to 5 secs.

I can’t figure out what is going on, so I spun up a MySQL server on the same configuration (only on one instance, though) and ran the same test. It ran at 300 - 400 queries per sec right to the end; I ended up running it until it had inserted 5 million rows into the table.

On top of that, I also ran the group IDs WHERE clause over Cockroach, which had about 2 million rows in the table, and it was about 10 - 15 or even 30 times slower than MySQL: MySQL was able to perform the query in a matter of 10 - 20 ms, while Cockroach would take about 600 ms. When running the same query repeatedly, Cockroach would take much less time, but it would still take 150 ms.

I also did run another load test last night, the test goes as follows :

  1. I inserted some rows (100,000) into the “first” table and it was performing at 70 - 100 QPS.
  2. Then I created another table, “second”, and inserted 300,000 rows into it; that test was able to perform at 500 - 600 QPS.

I have sent you the details of the cluster and test server as a private message, please advise. Thanks a ton.

Hi Karansoi,

To reiterate from our private discussion:

What concerns me most about your particular issue is the slowdown after inserting a bunch of rows. I’m not aware of a performance issue where insert speed slows down as the database grows; that might be indicative of an issue on our end, but it could also be an issue with your particular schema or your Tomcat application.

What would help us figure this out is some stats from Cockroach: if you could check out our Admin UI and either describe some of those graphs (or post some screenshots), that would really assist us in figuring out what’s going on. What we’d really want to see is the various graphs during the slowdown; some places to look for starters:

  • The Replication dashboard, to see if you have a lot of replication traffic occurring after reaching a certain threshold
  • Capacity on the Overview tab
  • Queues tab
  • The Storage tab, to see if there’s any sort of obvious disk-related thrashing

In terms of comparing benchmarks directly with MySQL, it’s difficult to do an apples-to-apples comparison. There are a number of factors that might tilt the benchmarks in one direction or another, for example:

  • Running CRDB on three nodes vs MySQL on one node; that means MySQL has no replication traffic. This also means that the table on CRDB will have its read leases scattered across all three machines, and this will result in numerous network round trips that aren’t necessary for the single-machine MySQL query.
  • MySQL may be operating at a lower isolation level than CRDB (MySQL’s default is Repeatable Read, while CRDB’s default is Serializable; Repeatable Read is significantly faster, but also significantly less safe and can result in consistency errors).

That said, I still think we should be doing better than 30x slower on that query. I was a bit confused about the circumstances, though; does CRDB take 600ms to execute the GroupID query, or 150ms?

Hi Matt,

You can have a look at the cluster data directly at this URL “

IP Addresses of all the Nodes in the Cluster are (Connected using Insecure Mode):
Node 1 :
Node 2 :
Node 3 :

Also, I am running the Java test server as well, so y’all can hit the APIs and send the params, and test the inserts per connection yourselves. (Please note: whatever connection number y’all put is multiplied by 3, as there are 3 nodes and I just wrote the code to call 3 times.)
API (GET) of the Java Server :
(Please note:
Parameter write_number states the number of writes each thread should do.
Parameter conn_number * 3 states the number of threads to run.)

To Insert into the Table Named First :

To Insert into the Table Named Second :

To Insert into the Table Named Third :

Also, I am adding all the screenshots of the Admin UI related to the test I just ran on a new table, “second”, in case I shut the cluster down. I am attaching 2 sets of screenshots: one during the first 400,000 inserts and one after that. I initially inserted 300,000 rows into the table using the Java tester, over 300 connections maintained on 300 threads. Then I ran the same test a second time, i.e. another 300,000 rows into the table. The screenshots show the sudden drop in speed, from 550 - 600 QPS to 150 - 200 QPS.

It is true that comparing it to MySQL is unfair, but I just wanted to get more data for a better understanding.

The GroupID query takes 600 ms most of the time, but sometimes it also runs at 150 ms for the same query, especially after running it multiple times.

Also, all the nodes are running the 2.0 alpha version.

g1-small on GCE is a shared-core machine type. This causes unpredictable performance for CockroachDB. We recommend using machine types that guarantee at least one entire core, such as the n1-standard family.

Hey Ben,

Thanks for your quick response. After reading your recommendation, I went ahead and started up a cluster of 3 n1-standard servers; I also put a load balancer over the 3 servers for a more real-world test.

I then ran a test to insert 300,000 rows into a fresh table, and it did run at roughly 600 QPS. Then I went ahead to insert more into the table, but again, at around 450,000 rows, the QPS dropped to approximately 150.

I am adding the screenshots of the Admin UI below. I am completely confused as to what is happening and where the problem is. Thanks a ton.


Please find the GCP server stats below.
This is the Cockroach Node 1 CPU Usage.

This is the Java Server CPU usage, of the test server I have built.

This looks like the effect of a range split. You can see it most clearly in the “KV Transactions” graph of the “distributed” tab. Initially, everything is able to commit on the fast path (which is only available for single-range transactions). After that, the “fast path” line drops to zero and commits are much lower than before. This is because the data range for this table has reached 64MB and split. This means that the primary table data and the secondary index are no longer on the same range, so the system has to do a distributed transaction instead of a single-range operation.

This drop-off in performance is expected for any table with a secondary index when it reaches a certain size. However, your starting point seems awfully low - you should be seeing much more than 600 inserts per second before the split, and a proportional change afterwards. Is your client single-threaded? You need multiple concurrent clients to reach the maximum throughput of the cluster.

My client? The Java program?

I can set the number of concurrent connections I want to use to connect to the database cluster, and it will create multiple threads; each thread will then send a configurable number of inserts per connection.

So I was testing it over 500 threads (i.e. connections) with 1000 inserts per connection, i.e. 500,000 rows inserted by the end of the execution, which you can see in the “SQL” tab.

Also, I just followed the instructions of installation provided in cockroach docs.

The inserts might be hitting a ceiling of 600 QPS because I had initialized each node with just 50 GB, so maybe they are hitting the IO limits set by GCP. I will later test with 500 GB per node.

Also all the nodes are running SSDs, and I am using the Google Load balancer.

What is the solution to the range split problem?

What is the solution to the range split problem?

This isn’t really a “problem” with a solution, it’s just how things work. Providing transactional consistency with secondary indexes has coordination costs. Secondary indexes slow down writes in order to speed up reads.

We have some docs on performance optimization here.

That does make sense, thank you for your help.
Just a follow-up question: if I have a table that does not have any secondary index, would it have a similar drop in performance after it reaches a certain size?

No, if there are no secondary indexes then (single-row) insertion performance would remain steady.

Thanks a ton. I again ran a test with a new table that didn’t have any secondary index, and its insertion speed didn’t decrease, just as you said.

Also, the new test I am running has the 3 nodes with a 500 GB SSD each, but they are giving me 700 - 900 QPS. Is that the number of inserts I should be seeing?

I have attached the Distributed Tab of the Cluster.

The schema of the table is:

id INT NOT NULL DEFAULT unique_rowid(),
"name" STRING(256) NOT NULL DEFAULT random()::STRING,
second_name STRING(256) NOT NULL DEFAULT random()::STRING,
last_name STRING(256) NOT NULL DEFAULT random()::STRING,
uid UUID NULL DEFAULT uuid_v4()::UUID,
suid UUID NULL DEFAULT uuid_v4()::UUID,
time_stamp TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
update_time TIMESTAMP NOT NULL DEFAULT clock_timestamp(),
group_id FLOAT NOT NULL DEFAULT round(random(), 4:::INT) * 10000.0:::FLOAT,
FAMILY "primary" (id, "name", second_name, last_name, uid, suid, time_stamp, update_time, group_id)