Group by query returns `incorrectly ordered stream` - v2.1.3

bug

#1

I did a quick search on Google, and the first hit is for incorrectly ordered stream is https://www.cockroachlabs.com/docs/releases/v2.1.1.html – which suggests the error message incorrectly ordered stream is a known failure mode which was fixed in v2.1

# Welcome to the cockroach SQL interface.
# All statements must be terminated by a semicolon.
# To exit: CTRL + D.
#
# Server version: CockroachDB CCL v2.1.3 (x86_64-unknown-linux-gnu, built 2018/12/17 19:15:31, go1.10.3) (same version as client)
# Cluster ID: 5d56db4b-a7ea-4bf7-93a8-97db64ffcaf4
#
# Enter \? for a brief introduction.
#

CRDB:26257/my_db> SELECT node_id, user_name, COUNT(*) AS my_count FROM work_m WHERE project_id = 'PID' AND  decision IN ('accepted', 'rejected') GROUP BY node_id, user_name ORDER BY node_id ASC, user_name ASC;
pq: incorrectly ordered stream ['accept-review-1' 'worker1' 1] after ['reject-review-2b' 'worker2' 9] (ordering: [{0 1} {1 1}])
CRDB:26257/my_db> EXPLAIN (VERBOSE) SELECT node_id, user_name, COUNT(*) AS my_count FROM work_m WHERE project_id = 'PID' AND  decision IN ('accepted', 'rejected') GROUP BY node_id, user_name ORDER BY node_id ASC, user_name ASC;
            tree            |    field    |                                                                                                                       description                                                                                                                       |                  columns                   |      ordering
+---------------------------+-------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------+---------------------+
  group                     |             |                                                                                                                                                                                                                                                         | (node_id, user_name, my_count)            | +node_id,+user_name
   │                        | aggregate 0 | node_id                                                                                                                                                                                                                                                 |                                            |
   │                        | aggregate 1 | user_name                                                                                                                                                                                                                                               |                                            |
   │                        | aggregate 2 | count_rows()                                                                                                                                                                                                                                            |                                            |
   │                        | group by    | @1-@2                                                                                                                                                                                                                                                   |                                            |
   │                        | ordered     | @1-@2                                                                                                                                                                                                                                                   |                                            |
   └── render               |             |                                                                                                                                                                                                                                                         | (node_id, user_name)                       | +@3,+@4
        │                   | render 0    | node_id                                                                                                                                                                                                                                                 |                                            |
        │                   | render 1    | user_name                                                                                                                                                                                                                                               |                                            |
        └── sort            |             |                                                                                                                                                                                                                                                         | (decision, project_id, node_id, user_name) | +node_id,+user_name
             │              | order       | +node_id,+user_name                                                                                                                                                                                                                                     |                                            |
             └── index-join |             |                                                                                                                                                                                                                                                         | (decision, project_id, node_id, user_name) |
                  ├── scan  |             |                                                                                                                                                                                                                                                         | (decision, project_id, node_id, id)        |
                  │         | table       | work_m@work_m_project_id_decision_idx                                                                                                                                                                         |                                            |
                  │         | spans       | /"PID"/"accepted"-/"PID"/"accepted"/PrefixEnd /"PID"/"rejected"-/"PID"/"rejected"/PrefixEnd |                                            |
                  └── scan  |             |                                                                                                                                                                                                                                                         | (decision, project_id, node_id, user_name) |
                            | table       | work_m@primary                                                                                                                                                                                                                     |                                            |
(17 rows)

Time: 41.5493ms

CRDB:26257/my_db>

#2

If I remove the ORDER BY, the query no longer fails! I wonder if this is related to a potentially stale index?


(Radu Berinde) #3

Hi,

I believe this is the same root cause as https://github.com/cockroachdb/cockroach/issues/35209. The giveaway is the invalid +@3,+@4 ordering of the render node; this can cause bad query results or the error you mentioned. The problem should be fixed in 2.1.6 which we plan to release on Monday.

Can you also run EXPLAIN (DISTSQL) on that same statement? Also, if you can post the CREATE TABLE statements for the tables in the query I can verify that it is indeed fixed.

As a workaround, you can turn off the optimizer with SET OPTIMIZER = off (but note that it may cause other queries to run less efficient plans).


#4

Thanks Radu! That’s good to hear.

Here’s the link for the distsql: https://cockroachdb.github.io/distsqlplan/decode.html#eJzMlF1r2zAUhu_3K8q5SkEjlp2v-splVxmdM9KEMUYQrnVwtSU-RpJZw8h_H44DsbNGSdZSehPQx6PznJMX_4GcJMbJCg2EP-AGGHCv-unBgkGhKUVjSFeH9dWxfILQY6DyorT1tlV2iRBCmZOWqFECA4k2UcvqfLFZMEhJI4T7qzF9pKI7OrjIgEq7e3bBwNgkQwj7G9YozRuln3l4ljwscYqJRN31Ws_Db9K_REqrYokWBRUFaVvmyq7FShSafmJqhZJCYqqMolwo-RQ5GGAwKW14FfEAjqnzS9Q_k8p35n7bvNBqlej1eTacRbx_1Mi_xOg2yzRmiSXd7bWNIs6iSvI2_i7iyUzE87u7TsSv_9nyq61Pk3k8E9PJt_vO9YHZvtjD-uoxMY_P1dlGaNdAcLSB_wpio8XBS1q8n38R43jWiYLDDvfqvZa6f36Q-TsL8gn1RpCDNwryCaPGv9x_n0H2XzHIw7cNsuee_RRNQbnBs771XtUaygzrORgqdYpfNaXbMvVysuW2GxKNrU9v6sU4r48qwSbMnbDvhn0nHLjh4BDmTbjXgvll8MgN95zanhvuO-GBu-eBEx664eFLBuaGTwxsdNHAFpsPfwMAAP__kLAMZg==

and the CREATE TABLE:

root@cockroachdb.apixio.com:26257/metric> show create table work_complete_opportunity_m;
          table_name          |                                                                                   create_statement
+-----------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
  work_complete_opportunity_m | CREATE TABLE work_complete_opportunity_m (
                              |     decision STRING NOT NULL,
                              |     phase STRING NOT NULL,
                              |     project_id STRING(63) NOT NULL,
                              |     action STRING NOT NULL,
                              |     overturn STRING NOT NULL,
                              |     patient_id STRING NOT NULL,
                              |     source STRING NOT NULL,
                              |     reject_reason STRING NOT NULL,
                              |     transaction_id STRING NOT NULL,
                              |     code STRING NOT NULL,
                              |     node_id STRING NOT NULL,
                              |     when_measured TIMESTAMPTZ NOT NULL,
                              |     id STRING NOT NULL,
                              |     uorg_id STRING(63) NOT NULL,
                              |     user_name STRING NOT NULL,
                              |     doc_id STRING NOT NULL,
                              |     CONSTRAINT "primary" PRIMARY KEY (id ASC),
                              |     INDEX phase_idx (phase ASC),
                              |     INDEX code_idx (code ASC),
                              |     INDEX when_measured_idx (when_measured ASC),
                              |     INDEX decision_idx (decision ASC),
                              |     INDEX transaction_id_idx (transaction_id ASC),
                              |     INDEX user_name_idx (user_name ASC),
                              |     INDEX overturn_idx (overturn ASC),
                              |     INDEX reject_reason_idx (reject_reason ASC),
                              |     INDEX doc_id_idx (doc_id ASC),
                              |     INDEX node_id_idx (node_id ASC),
                              |     INDEX uorg_id_idx (uorg_id ASC),
                              |     INDEX project_id_idx (project_id ASC),
                              |     INDEX patient_id_idx (patient_id ASC),
                              |     INDEX id_idx (id ASC),
                              |     INDEX action_idx (action ASC),
                              |     INDEX work_complete_opportunity_m_project_id_idx (project_id ASC) STORING (phase, overturn, decision, node_id),
                              |     INDEX work_complete_opportunity_m_project_id_node_id_when_measured_decision_idx (project_id ASC, node_id ASC, when_measured ASC, decision ASC) STORING (user_name),
                              |     INDEX work_complete_opportunity_m_project_id_decision_idx (project_id ASC, decision ASC) STORING (reject_reason, node_id),
                              |     FAMILY "primary" (decision, phase, project_id, action, overturn, patient_id, source, reject_reason, transaction_id, code, node_id, when_measured, id, uorg_id, user_name, doc_id)
                              | )
(1 row)

Time: 48.5618ms

(Radu Berinde) #5

Thank you. I verified that this is fixed in 2.1.6. For the record, here is the plan with 2.1.5 (similar to yours): https://goo.gl/HQVKLW and this is with 2.1.6 (the difference is the “ordered” input in the second aggregation stage): https://goo.gl/tcGqJ5