Search for a string with a given prefix

(Ivan Dubrov) #1

Given that LIKE with ESCAPE is not currently optimized via index for prefix searches (https://github.com/cockroachdb/cockroach/issues/30192), what is the best way to search for a key with a given prefix in case prefix is only known at runtime?

I us LIKE against prefix + ‘%’, but I’m not able to escape ‘_’ and ‘%’ in the prefix itself, which makes this search incorrect. LEFT(key, 3) = 'abc' is also not optimized.

I could do something like key >= <prefix> AND key < <next of prefix>, where <next of prefix> would be upper bound for the range (for example, it will be ‘abd’ for prefix ‘abc’), but computing such string could be error-prone (especially, given UTF-8 – not sure what’s the correct way to do it!).

2 Likes

(Ron Arévalo) #2

Hey @idubrov,

Sorry for the delay in response to this, could you elaborate a bit more on what you’re trying to accomplish?

I’m not clear on what you mean when you give your examples.

Thanks,

Ron

0 Likes

(Ivan Dubrov) #3

I have a table, say CREATE TABLE test(key STRING PRIMARY KEY, value STRING)

I need to do a “prefix” search based on the key, in other words, find all the rows with the key starting with a given string. Prefix is given as a parameter to my function.

0 Likes

(Ron Arévalo) #4

Hey @idubrov,

I spoke with our SQL Execution team, and they agreed that using key >= <prefix> AND key < <next of prefix> would be the best course at the moment. However you would need to define the next function, here is how we have done it on our end:

Let me know if you have any other questions.

Thanks,

Ron

0 Likes

(Ivan Dubrov) #5

Yes, that’s pretty much what I ended up with. One thing, though, is that I work on the client side where I have strings, not bytes. This algorithm can potentially generate an invalid UTF-8 sequence, but seems like CockroachDB works just fine.

0 Likes