Slow query using time ranges

We have the following table we schedule jobs into:

root@10.138.255.58:26257/workers> show create table workers.tasks;
       table_name      |                                                create_statement
+----------------------+----------------------------------------------------------------------------------------------------------------+
  workers.public.tasks | CREATE TABLE tasks (
                       |     ulid STRING NOT NULL,
                       |     data JSONB NULL,
                       |     created_at TIMESTAMP NULL,
                       |     completed_at TIMESTAMP NULL,
                       |     company_uuid STRING NULL DEFAULT '':::STRING,
                       |     task_type STRING NULL DEFAULT '':::STRING,
                       |     scheduled_at TIMESTAMP NULL,
                       |     started_at TIMESTAMP NULL,
                       |     CONSTRAINT "primary" PRIMARY KEY (ulid ASC),
                       |     INDEX created_at_idx (created_at DESC),
                       |     INDEX completed_at_idx (completed_at DESC),
                       |     INVERTED INDEX detail_idx (data),
                       |     INDEX company_tasks (company_uuid DESC),
                       |     INDEX scheduled_at_idx (scheduled_at DESC),
                       |     INDEX started_at_idx (started_at DESC),
                       |     FAMILY "primary" (ulid, data, created_at, completed_at, company_uuid, task_type, scheduled_at, started_at)
                       | )
(1 row)

Time: 20.56551ms

When scanning this table for jobs that should be executed in the next 10 seconds, the query time is much slower than I’d expect:

root@10.138.255.58:26257/workers> SELECT ulid, data, company_uuid, task_type FROM workers.tasks WHERE started_at IS NULL AND scheduled_at < NOW() + INTERVAL '10 SECONDS' ORDER BY scheduled_at DESC;
  ulid | data | company_uuid | task_type
+------+------+--------------+-----------+
(0 rows)

Time: 4.056390698s

root@10.138.255.58:26257/workers> select count(ulid) from workers.tasks;
   count
+--------+
  108231
(1 row)

Time: 187.548958ms

I’ve had a look at the explain for this but I can’t make sense of it:

root@10.138.255.58:26257/workers> EXPLAIN SELECT ulid, data, company_uuid, task_type FROM workers.tasks WHERE started_at IS NULL AND scheduled_at < NOW() + INTERVAL '10 SECONDS' ORDER BY scheduled_at DESC;
               tree              | field  |          description
+--------------------------------+--------+--------------------------------+
  render                         |        |
   └── sort                      |        |
        │                        | order  | -scheduled_at
        └── render               |        |
             └── filter          |        |
                  │              | filter | scheduled_at < (now() + '10s')
                  └── index-join |        |
                       ├── scan  |        |
                       │         | table  | tasks@started_at_idx
                       │         | spans  | /NULL-
                       └── scan  |        |
                                 | table  | tasks@primary
(12 rows)

Time: 4.07084ms

Can you possibly help me decode the EXPLAIN output to better understand how this query is being planned/executed, so I can find the problem? Or is this simply incorrect use of indexes with CockroachDB?

Hey @charl,

Can you send us the output of EXPLAIN (OPT, ENV) so we can take a look at all the details used by the optimizer, including statistics?

Thanks,

Ron

Hi @ronarev,

We are running v2.1.2, which does not appear to support the ENV part of that:

# cockroach version
Build Tag:    v2.1.2
Build Time:   2018/12/10 20:15:47
Distribution: CCL
Platform:     linux amd64 (x86_64-unknown-linux-gnu)
Go Version:   go1.10.3
C Compiler:   gcc 6.3.0
Build SHA-1:  8d807ce49978c46ed4daa03e6dbca5b2b5acfcc1
Build Type:   release

Without it:

> EXPLAIN (OPT) SELECT ulid, data, company_uuid, task_type
  FROM workers.tasks
  WHERE started_at IS NULL
  AND scheduled_at < NOW() + INTERVAL '10 SECONDS'
  ORDER BY scheduled_at DESC;
                                                                             text
+-------------------------------------------------------------------------------------------------------------------------------------------------------------+
  sort
   ├── columns: ulid:1(string!null) data:2(jsonb) company_uuid:5(string) task_type:6(string)
   ├── side-effects
   ├── stats: [rows=3.33333333]
   ├── cost: 52.2157977
   ├── key: (1)
   ├── fd: (1)-->(2,5-7)
   ├── ordering: -7
   ├── prune: (1,2,5-7)
   └── project
        ├── columns: ulid:1(string!null) data:2(jsonb) company_uuid:5(string) task_type:6(string) scheduled_at:7(timestamp!null)
        ├── side-effects
        ├── stats: [rows=3.33333333]
        ├── cost: 52.0333333
        ├── key: (1)
        ├── fd: (1)-->(2,5-7)
        ├── prune: (1,2,5-7)
        └── select
             ├── columns: ulid:1(string!null) data:2(jsonb) company_uuid:5(string) task_type:6(string) scheduled_at:7(timestamp!null) started_at:8(timestamp)
             ├── side-effects
             ├── stats: [rows=3.33333333, distinct(8)=1]
             ├── cost: 52
             ├── key: (1)
             ├── fd: ()-->(8), (1)-->(2,5-7)
             ├── index-join tasks
             │    ├── columns: ulid:1(string!null) data:2(jsonb) company_uuid:5(string) task_type:6(string) scheduled_at:7(timestamp) started_at:8(timestamp)
             │    ├── stats: [rows=10]
             │    ├── cost: 51.9
             │    ├── key: (1)
             │    ├── fd: ()-->(8), (1)-->(2,5-7)
             │    └── scan tasks@started_at_idx
             │         ├── columns: ulid:1(string!null) started_at:8(timestamp)
             │         ├── constraint: /-8/1: [/NULL - /NULL]
             │         ├── stats: [rows=10, distinct(8)=1]
             │         ├── cost: 10.4
             │         ├── key: (1)
             │         └── fd: ()-->(8)
             └── filters [type=bool, outer=(7), side-effects, constraints=(/7: (/NULL - ])]
                  └── lt [type=bool, outer=(7), side-effects, constraints=(/7: (/NULL - ])]
                       ├── variable: scheduled_at [type=timestamp, outer=(7)]
                       └── plus [type=timestamptz, side-effects]
                            ├── function: now [type=timestamptz, side-effects]
                            └── const: '10s' [type=interval]
(43 rows)

Time: 8.296088ms

Any pointers on this?

Hey @charl,

Sorry for the delay, we hope to get back to you later today.

Thanks,

Ron

Hey @charl,

It looks like you’re running into this bug. Basically, the now() function isn’t being evaluated before index selection, so the query ends up doing a full table scan. I would suggest using a constant that is generated on your end by your app instead of the now() function; this should improve the performance of this query. We should then still choose the correct index for the query.
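For example, a rough sketch of what that could look like (the timestamp literal below is only a placeholder; your application would compute the current time plus 10 seconds itself, or bind it as a parameter such as $1 in a prepared statement):

SELECT ulid, data, company_uuid, task_type
  FROM workers.tasks
 WHERE started_at IS NULL
   AND scheduled_at < '2018-12-20 10:00:10'  -- placeholder: app-computed now() + 10s
 ORDER BY scheduled_at DESC;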

Let me know if that doesn’t work for you, or if the performance doesn’t improve significantly.

Thanks,

Ron