ORDER BY and GROUP BY clauses on JSONB data


#1

Hello crdb friends,

I tested the following statements on crdb and received error messages:

root@:26257/defaultdb> create table test4 (id int not null primary key, data jsonb not null);
CREATE TABLE

Time: 34.379874ms

root@:26257/defaultdb> insert into test4 values (1, '{"a": {"b": {"c": [1,2,3]}}}'),(2, '{"x": {"y": {"z": [1,3,5]}}}'),(3, '{"a": {"b": {"c": [1,4,5]}}}');
INSERT 3

Time: 21.268096ms

root@:26257/defaultdb> select * from test4;
  id |              data
+----+--------------------------------+
   1 | {"a": {"b": {"c": [1, 2, 3]}}}
   2 | {"x": {"y": {"z": [1, 3, 5]}}}
   3 | {"a": {"b": {"c": [1, 4, 5]}}}
(3 rows)

root@:26257/defaultdb> select data->'a'->'b'->'c'->0 firstel from test4;
  firstel
+---------+
        1
  NULL
        1
(3 rows)

root@:26257/defaultdb> select data->'a'->'b'->'c'->0 firstel, count(*) freq from test4 group by data->'a'->'b'->'c'->0 order by 1;
pq: can't order by column type jsonb
root@:26257/defaultdb> select data->'a'->'b'->'c'->0 firstel, count(*) freq from test4 group by data->'a'->'b'->'c'->0;
pq: unable to encode table key: *tree.DJSON

But they executed as expected in PG.

Is this simply a yet supported feature or is it more of a bug?

Thanks.


(Ron Arévalo) #2

Hey @antonius.lin,

We do support this feature, you would need to use the ->> operator. Your queries should read as follow:

root@:26257/defaultdb> select data->'a'->'b'->'c'->>0 firstel, count(*) freq from test4 group by data->'a'->'b'->'c'->>0 order by 1;
  firstel | freq
+---------+------+
  NULL    |    1
  1       |    2
(2 rows)

The same goes for the second query, changing the operator will return values. When using a single arrow operator a json value is returned, the double arrow operator returns a string which allow you to order it. You can read more about the different operators here.

Let me know if you have any questions.

Thanks,

Ron


(Ben Darnell) #3

Note that ->> returns a string and uses string comparison rules (so "10" < "2"). If you want numeric ordering, you need to cast this to a float: (data->'a'->'b'->'c'->>0)::float


#4

Hi @ronarev, @bdarnell,

Thanks for your quick response
Confirming that it works as indicated.
So basically, I’d have to

root@:26257/defaultdb> select data->'a'->'b'->'c'->>0 firstel, count(*) freq from test4 group by data->'a'->'b'->'c'->>0;
  firstel | freq
+---------+------+
  1       |    2
  NULL    |    1
(2 rows)

Time: 3.481477ms

root@:26257/defaultdb> select data->'a'->'b'->'c'->>0 firstel, count(*) freq from test4 group by data->'a'->'b'->'c'->>0 order by 1;
  firstel | freq
+---------+------+
  NULL    |    1
  1       |    2
(2 rows)

Time: 1.677026ms

And yes while I know of the operator ->> (from PG-days), at the moment, I’m feeling like I’m going about this not exactly the right way. Basically, what I’ve been trying to do is port things from PG to CRDB and found certain things that worked in PG, ran into hiccups when in the new CRDB environment. While understanding that PG is very mature, in contrast to CRDB is young, new and exciting (for me), I’m thinking if maybe there is perhaps a better approach I should consider to eventually fully migrate out of PG?

Thanks much again for the tips!